Mega-fanfic archive dreams

Mega-fic archive wished for.

Scalability is the hard item on the wishlist. Or rather, the one that requires the brains of whatever poor sap they coax into writing it for them. Those guys need a programmer who doesn't suck from the first moments that the project exists. I don't know that they understand what it means to scale the way they're talking: bandwidth costs, CPU. Disk is at least cheap... The architectural choices need to be made by a programmer who's done this before. The tempting early mistake is putting it all into an SQL database; this mistake is fatal for scalability. All the work is in that initial data design.
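To make the bandwidth and CPU worry concrete, here is a back-of-envelope sketch. Every number in it (200,000 daily visitors, 10 pages per visit, 50 KB per page, a peak factor of 3) is an assumption I picked for illustration, not a measurement from any real archive, and the function names are hypothetical.

```python
# Back-of-envelope load estimate. All inputs are illustrative assumptions,
# not figures taken from any existing archive.

SECONDS_PER_DAY = 86_400


def monthly_bandwidth_gb(daily_visitors, pages_per_visit, kb_per_page, days=30):
    """Approximate outbound transfer per month, in gigabytes."""
    kb_per_day = daily_visitors * pages_per_visit * kb_per_page
    return kb_per_day * days / (1024 * 1024)


def peak_requests_per_second(daily_visitors, pages_per_visit, peak_factor=3.0):
    """Average page requests per second, scaled by an assumed peak factor."""
    average = daily_visitors * pages_per_visit / SECONDS_PER_DAY
    return average * peak_factor


# Assumed traffic profile (hypothetical).
estimate_gb = monthly_bandwidth_gb(200_000, 10, 50)
peak_rps = peak_requests_per_second(200_000, 10)

print(f"Monthly transfer: about {estimate_gb:,.0f} GB")
print(f"Peak load: about {peak_rps:.0f} page requests per second")
```

Change any of the assumed inputs and the totals move in proportion, which is the point: somebody has to pay for whatever number falls out.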

Goes without saying that you'd open-source it.

The initial feature list includes more items about social issues than about software specification, yet the few that are about the software include some mighty assumptions. Modulating down the expectations to something satisfiable given the constraints of the project would be a trick.

The political fu involved in volunteering on something like that would scare me off. Huge demands + huge wankage + huge pressure + no money. The first thing I'd do is build a very very very high wall around myself so I wouldn't have to put up with the chattering classes. Then fling frequent iterative releases over the wall, XP/Agile style. Often programmers desperately need more interaction with their customers, so they can better understand what they want. In this case, the programmers would need insulation from the customers. Note the trend in the comments: "I'm not a 'coder', but I would like it to..." Have rainbow ponies, do my laundry, save me from having to categorize my deathless prose in any way yet magically make it easily findable by category-driven readers... but the commenters probably don't realize that's what they're asking for, because they don't have the domain knowledge. Some translator needs to tell them so, nicely.

The key insight is that the audience for this archive is the reader. Not the writer. Some of the odder writer-centric demands need to be politely ignored in favor of serving the reader as well as possible.

Finding stories is the major task. By fandom, pairing, author, genre, content, tags, "readers who liked this also liked....", however it works. That's the task an archive has to do well. Oh yeah, and serve those millions of stories to the hundreds of thousands of daily visitors without choking. That too.
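A minimal sketch of the finding task, assuming stories carry free-form tags and readers can mark stories they liked. The class name, the method names and the "fandom:"/"pairing:"/"genre:" tag scheme are all hypothetical, and an in-memory dictionary stands in for whatever storage a real archive would use; this is an illustration of the idea, not a design that would survive the traffic described above.

```python
from collections import Counter, defaultdict


class StoryIndex:
    """Toy inverted index over tags, plus like-based co-occurrence lookups."""

    def __init__(self):
        self._by_tag = defaultdict(set)   # tag -> ids of stories with that tag
        self._likes = defaultdict(set)    # reader -> ids of stories they liked

    def add_story(self, story_id, tags):
        for tag in tags:
            self._by_tag[tag.lower()].add(story_id)

    def find(self, *tags):
        """Return the ids of stories that carry every one of the given tags."""
        if not tags:
            return set()
        matches = [self._by_tag.get(tag.lower(), set()) for tag in tags]
        return set.intersection(*matches)

    def record_like(self, reader, story_id):
        self._likes[reader].add(story_id)

    def also_liked(self, story_id, limit=5):
        """Stories most often liked by readers who also liked story_id."""
        counts = Counter()
        for liked in self._likes.values():
            if story_id in liked:
                counts.update(liked - {story_id})
        return [other for other, _ in counts.most_common(limit)]


# Hypothetical sample data.
index = StoryIndex()
index.add_story("s1", ["fandom:trek", "pairing:kirk/spock", "genre:drama"])
index.add_story("s2", ["fandom:trek", "genre:humor"])
index.add_story("s3", ["fandom:other", "genre:drama"])

for reader, story in [("r1", "s1"), ("r1", "s2"), ("r2", "s1"),
                      ("r2", "s2"), ("r3", "s1"), ("r3", "s3")]:
    index.record_like(reader, story)

print(index.find("fandom:trek", "genre:drama"))
print(index.also_liked("s1"))
```

Note that the index only works if the writer supplies tags in the first place, which is exactly the categorizing chore some of the commenters want to skip.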
  • Current Music: Count It Out : Erik Satie & Dave Brubeck (mixed by Alex C) : Erik Satie
WORD. it's a lovely idea in the pie in the sky phases, but realizing those dreams?

one of the things i'm most grateful for was being employed as a receptionist type person in small companies during the end of the boom/into the bust, because i learned first how to listen to our programmers/engineers/it people without having the work hanging over our heads (in the sense that i wasn't a project manager or a sales person - so i learned details about projects without needing to get stuff done)... and then coming to realize that 98% of the time the sales/top level project managers/execs/marketing depts don't really have a clue even how to *listen* to the people who work on creating the systems that help everybody else do their job. And sure, the reverse is true too. It was fun (in that 'wow-this-is-sometimes-not-even-the-same-language' kind of way) being the mediator.

so, question - you bored and looking for projects?
Yeah, the translation definitely needs to go both ways! Programmers to marketing/sales people (or to an even better representative of their end user). And the non-technical staff back to the technical staff. Every profession has its jargon, which evolves so that people can communicate with each other more quickly. Technical professions are the worst; there's a huge ton of specialized stuff that the programmers know that is not easily communicated outward.

Good project managers can do this and are worth their weight in gold.

I am moderately bored, and somewhat idly seeking a new hobby programming project. (Not one the size of that archive! For one thing, I don't have the scalability experience they'll need. I know enough to know I don't know how to do that project.) An interesting project would tempt me, if it weren't too huge. Can't promise much free time to devote to it until after my first SOG day on June 21 ;)
Well, I'm not a programmer, but I took one look at the first couple sets of ideas for what the proposed archive should do...and I said to myself that if I was a programmer, I'd be ducking and covering about then.

'Hi, we want the universe on a silver platter with watercress and caviar around it for nothing with no problems or hiccoughs ever...and a pony!' Yikes!
Oh yes. No volunteering on this one for me. It would be beyond thankless and right in the territory of soul-sucking.

What I would do if I were cannier (or maybe a tad less emphatically employed than I am at the moment) is take that list and go away and write something that's what they secretly want, though they don't know it. It is an interesting problem to solve, if you take away all the drain of interacting with the people who don't realize that what they want is incompatible with everything else they want. (Or who don't realize that it's not a political problem. Asking more emphatically doesn't change reality. I've seen this in people I've worked with. "If only I yell, it'll make these people tell me I can have it in three weeks instead of this horrible three months they're saying now.")
Pardon my density, but why do we need this? I mean, we really need all fandoms, everywhere, to be united on one site? Did I miss something? I must have, I'm always missing something like that.

Don't most fandoms already have a main archive site? Isn't that the point - that we don't have to wade through loads of crap to get to our favorite authors/challenges/pairings/etc?

I get that some of those features would be nice for any archive, but I don't see why it's beneficial to have all fandoms on one site. That just seems...mind boggling.
Imagine only sensibly designed.
Imagine LJ extended in a direction that assumes the content posted is stories.
Imagine that fandom is making a pre-emptive strike against non-fans moving in and filling unmet needs. (There is this thing I haven't bothered to look at that has the meta-people in a tizzy, Fanlib or something.)

I've just recently been through the New Reader / Visitor From Mars pattern, and I think finding stuff, particularly recent stuff, has been made much more difficult by LJ's takeover of much fandom interaction. Google helps, but then you have weird writers turning off web spidering for their journals so nobody can find them who isn't already fully plugged in. Improved access to content is a worthy goal.
here via metafandom
Often programmers desperately need more interaction with their customers, so they can better understand what they want. In this case, the programmers would need insulation from the customers.


It's a fantastic idea, but as a technical writer I've sat in enough kick-off meetings for Projects from HELL to see that they're going to need one hell of a project manager to pull the whole thing through.

I couldn't spot from the comments if they have any hard-core coders already on-board, but I'm worried that all those ideas - while good for brainstorming - may just frighten away the people who're supposed to make this thing do what it's supposed to do. Because I'm sure that when and if the archive actually happens, it won't have all those features that people want and there will be wank about why feature X was implemented and not feature Y (in the way of the eternal "why can't we have more userpics instead of the phoneposts because I never use them" complaints at news posts).
Re: here via metafandom
Hello from metafandom, oh amazing Giles-icon-making one!

I completely agree that the people with the skills needed to write this successfully would be running far, far away after reading that thread. Lots and lots of competing interests in the feature suggestions, for one thing. And there will be wank. Sigh.
Thoughts on the problems of a large archive, or why that discussion bothered me... Part 1
The pie in the sky thing bothered me to a certain degree. I just did not feel comfortable discussing it. My own archive failed, for reasons with lots of details I'd prefer not to get into. I had an 80 page implementation paper I wrote on how it should be done. It was well thought out before execution was started. Execution was the problem. In that respect, I envy FanLib, as they have the money to do what I could not.

The discussion was interesting. In theory, what would everyone like? It demonstrated to me the large problem with such a site: there are too many opinions, too many conflicting opinions, too many claims of group ownership, and very little talk about who the buck would stop with. Ultimately, any project needs that person.

Behind the scenes on large archives, you have to deal with user bases who do not agree. One group is "No to adult fic." Another group is "Yes to adult fic." You may have moderators who are "Yes to heavy handed involvedness." Another set of moderators abhors that approach. You're going to alienate someone. It is best to be prepared for that eventuality beforehand by having a clearly defined mission and a clearly defined (for want of a better term) business plan with timelines. It is best to be open with what you're doing but prepared to be steadfast on some things. And all of that needs to be planned BEFORE the archive exists. It needs to be documented. Policies need to be set beforehand to avoid those conflicts and to help get the right staff members, the right programmers, the right contributors. It just makes sense. While fandom might not think that money should be involved, that does not mean that valid business practices should not be borrowed to deal with situations, especially ones of that scope.

The other issue I had with that discussion was market research. A number of things talked about were done by LiveJournal, JournalFen, Quizilla, RockFic, AdultFanFiction.Net, FanFiction.Net, FanWorks.Org, Fic Wad, Fan Lib, etc. None of those sites operates exactly the same way. None provide exactly the same service. Realistically, if you're doing a good business plan, you scope out the competition, find out what they offer, compare services, etc. You then talk about how your services will be different. What makes FanFiction.Net not workable? Why is there a need to duplicate what FanFiction.Net seems to have done? Or RockFic has done? Or AdultFanFiction.Net has done? Or Fic Wad has done? Before such an archive is created, that needs to be addressed. Call it a fandom type need analysis assessment and why the competition does not meet those needs.

Another issue is target audience. Who is the target audience for this archive? Is it ONLY LiveJournal users? Is it FanFiction.Net users? Is it MySpace users? What are their demographics? What are the user needs? A 15 year old new to fandom is going to have different needs and desires than a 35 year old just getting into fandom, or a 25 year old who has been in fandom for ten years. Is the demographic English only? Will the archive assume familiarity with fandom? This demographic thing is really important. I know some people in that discussion defined fan fiction fandom as only media fandom people, who traced their fannish lineage to Kirk/Spock, who had been involved in fandom for at least five years and who were over the age of 20. Others defined it as anyone who wrote fictional stories based on other things. TWO totally different groups, two totally different demographics. The same archive will not work for both, and unless all parties have some idea as to who they are aiming for, there will be confusion (for users and moderators). Not good.

Another thing that bothered me was that people didn't seem to talk about the history of various fan fiction archives. They brought up things that for me, sounded like mistakes, mistakes other archives have made that either shut them down or had to deal with in order to continue. Some examples...
Thoughts on the problems of a large archive, or why that discussion bothered me... Part 2

There was a lot of discussion regarding legal situations, contacting lawyers, covering your ass to prevent being sued. In my limited knowledge, I have never heard of a fan fiction site ACTUALLY being sued. I have heard of cease and desist letters, and have received one myself as a result of content on a fan fiction archive. FanFiction.Net merely prohibited users from posting that content. It was done as the creators contacted FanFiction.Net with legal threats or general requests for removal. Some of it was done in reaction to external legal issues. For FanDomination.Net, some content was prohibited by the Terms of Service. The Terms of Service was written with a legal and ethical understanding, and basically stated we would bend over backwards to comply with their demands to protect ourselves legally. (I left FanDomination.Net in August 2005. For disclosure's sake.) A lot of legal type stuff is not a result of TPTB meddling but of angry fen putting TPTB in a place where they have to act. And yeah. But at any rate, I think the implied legal threat was overstated and ignores the history of fan fiction and of fan fiction archives.

The other issue is funding. There was a lot of talk about money and people offering it. It sounds great, but historically a lot of the larger multifandom automated archives just absolutely do not make money based on donations alone. Fiction Alley has a Cafe Press shop. The people involved with that have sunk a lot of their own personal money into it. For FanDomination.Net, between April 2002 and August 2005, I put over $3,500 of my own money into the site. We asked, begged, cajoled people for money. We probably had donations that never topped $500 TOTAL. This was when we had a user base of 67,000 active users. AdultFanFiction.Net has had a lot of problems with paying for their hosting. They've disappeared for long periods because they got really behind on it. FanWorks.Org is smaller and makes no money. It is funded by its creator to show off his programming skills to potential employers. RockFic has accounts. FanFiction.Net was at one point costing Xing around $2,000 or so a month to maintain. (This was back in 2000 when it was being run on a Cold Fusion server.) The site was kept afloat because Xing's employer allowed it for various reasons. If he had had to fund it on his own, it would not have happened. Soup Fiction's server load was too big and it closed down. They couldn't pay for it and users weren't donating. The lesson there: historically, donations have not been able to fund these archives because of the cost involved. And because, by the time the sites reach that size, users see them less as archives and communities and more as services, and many web services are ones they don't feel a need to pay for. Those historical lessons are important ones that anyone seriously contemplating an archive should consider.
Re: Thoughts on the problems of a large archive, or why that discussion bothered me... Part 2
I didn't need to read your case histories to guess that fans won't stump up at the rate they would need to to keep these sites going. :(

I think, for a number of reasons, this project won't move an inch without a wealthy bored retired-post-IPO programmer deciding to make it his or her hobby project. And throwing a bunch of money into it. Or you'd need to do a paid subscription model (sub or have ads on your pages), and they'd scream. Sadly they've been taught they can have things for free.

*Waves umbrella* Back in my day we mail-ordered fanzines for real cash!

Thoughts on the problems of a large archive, or why that discussion bothered me... Part 3
Another historical concern should be content. It is great to say "Archive everything!" and "Age checks but still allow minors to access materials!" But that has been done before with varying degrees of success. FanFiction.Net eventually canned the most explicit material. RestrictedSection had problems and had to password protect themselves because of the adult material. SugarQuill and other sites had to become COPA compliant. FanFiction.Net and other sites pulled songfic because of the legal situation in Germany where for every full set of song lyrics, some sites were being told to pay €1,500 or some number like that. SkyHawke found out the legal problems involving Chan the hard way because of laws in their country. Media fandom, especially book based media fandom, is more prone to legal threats than Real Person Fic. This has made some sites leery of media fic. RockFic really doesn't allow media fic type stuff in their stories as a result. They determined that RPF was really the most legal thing for them. RockFic also decided to keep out all the minors. And found mods to enforce that. They made a game of it. New users were screened and they were checked all over fan space to verify information. So things like adult content, songfic, RPF and Media Fic, Chan, all that content needs to have a policy position AND the decision should be made based on information on what worked and what didn't work for other archives... in addition to any more issues that the creators have.

Those are my big three history things ignored.

At the same time, you need to find someone to be in charge. Buck stops here. That person should be the project manager and needs to be in charge of the programmer or programmers. You can't just have 30 people volunteering to code and let them all have access to your code to work on. That would be a nightmare from hell. The work would need someone in charge, and would need to have whatever database it uses planned out well in advance (and planned to be scalable) because it will need to be modularized. And tested. And tested again. Seriously: loose coding could crash that site.

But first, you need a person in charge, need to know your money constraints, need to look at the pool of programmers who are willing to commit to it, need to find a host that SOMEONE can PHYSICALLY access. FanFiction.Net, FanDomination.Net, Slash City, all those have people in charge who can physically touch the server they have. (It also means that they generally have connections with said server owners who might be willing to be more flexible with the payment.) And once you have that server and all that, then you start deciding the programming language and packages. Cold Fusion? PHP? ASP? MySQL? CGI? C++? HTML?

After that has been chosen, then you can get programmers. Not before. Seriously, how can someone, especially someone who can actually code well enough to do what needs to be done, volunteer to program if they don't know what it is? A COBOL programmer might be able to do C++ or PHP, but it's probably not their preference and probably not going to be a good fit. Get the programmers. Agree on standards. Get samples. Because you don't want someone who has never programmed, or has only taken an intro class, being slotted into a complex project like that. And all of that needs to be done BEFORE the site is actually worked on coding wise.

Re: Thoughts on the problems of a large archive, or why that discussion bothered me... Part 3
The programmer staffing thing works like this: you hire the architect/lead engineer first. Then that person makes design decisions and top-level implementation choices, with a number of factors in mind. The eventual feature set, the anticipated load, the resource constraints, and so on. You don't want non-engineers making those decisions. Then the architect + the project manager recruit the remainder of the staff based on the skills required. You politely turn away the eager and helpful but tragically inexperienced volunteers.

Recruit the project manager first of all, and hope to gawd you've found a saint willing to work for the meager glory and the resume item.
Thoughts on the problems of a large archive, or why that discussion bothered me... Part 4
After that, it needs to be diagrammed out, modularized out. Is there open source that could be integrated in? How do the various modules fulfill the mission of the site? Are they feasible? Will they be scalable? Diagram, diagram, plan, plan.

And only after that can you even begin to code the project. And that will take time, a lot of time. It should take at least a month, maybe six. And realistically, I can't see most people taking those steps to do it right. I can see ample demonstration of that in that thread, when they failed to talk about those issues, when they didn't talk about other archives and what they could learn from their mistakes and successes. And good luck finding programmers and a person to be the person where the buck stops. They are destined to get shat on and beaten up on by fandom. I've had my own loveliness with idiotic users. I've seen the shit that people have blasted at Xing. I distinctly recall the crap fest that people gave to Sky Hawke. I remember the cries of ELITISM IS EVIL! for Fic Wad. I remember the howls about AdultFanfiction.Net not always being available. (History again eh?) You really would have to be willing to stick your neck and wallet on the line... and fandom rarely rewards that.
Re: Thoughts on the problems of a large archive, or why that discussion bothered me... Part 4
And good luck finding programmers and a person to be the person where the buck stops. They are destined to get shat on and beaten up on by fandom.

The key insight is that the audience for this archive is the reader. Not the writer.

Sorry? Where is this written?

My interpretation was more along the lines of: the users of this archive will be both creators and consumers. My assumption, based on the impetus for the post (i.e., the FanLib thing), was that the desire was for a place that would be responsive to those who create the content. The writers, in other words, and vidders and artists, and those who wish to enjoy their work.

Have I misread?

(via metafandom)
That would be my insight about what would be required for a successful design. Writers' needs would also matter, in my approach, but they wouldn't be the primary driver. Readers outnumber writers by at least a hundred to one, and that estimate is probably conservative.

The point can be argued, and no doubt will be. At length.
astolat is the technical person for Yuletide, so she does have some experience with this kind of thing.
I'm not that familiar with yuletide. Can you give some perspective on:

1. Number of authors, contributors;
2. Number of stories;
3. Amount of traffic for the website;
4. Amount of hard drive space the website occupied;
5. Demographics of users; and
6. The hardware and programming language used for the site.
Your analysis is sadly true, and scalability is one of those issues that's not pretty but is necessary to architect in from the beginning. Someday, we'll find a rich person to bankroll this stuff.