Inside The Deal That Made Bill Gates $350,000,000 - Fortune Features
Awesome article from 1986 about Microsoft going public
Jeff giving props to the PlentyOfFish guy and his scaling awesomeness. The money quote on both blogs:
The problem with free is that every time you double the size of your database the cost of maintaining the site grows 6 fold. I really underestimated how much resources it would take, I have one database table now that exceeds 3 billion records. The bigger you get as a free site the less money you make per visit and the more it costs to service a visit.
Still, he has managed to build PlentyOfFish on just a handful of servers. Maybe I should look into .NET.
Also check out this look at PlentyOfFish's architecture.
Update June 24, 2009: Jeff just wrote a piece on "scaling up" vs. "scaling out" that also uses Plenty of Fish as the example.
A project I am working on, plus a question in #php today, got me working out how to extract text from Word documents on Linux. I have actually tried to figure this out before, but the closest I got was the Apache POI project and the Open Office API, and both seemed like more than I needed.
However, today I discovered Catdoc. I wasted an hour trying to get Catdoc to compile without being root and without write permissions to the directories Catdoc wants to install to. Google then led me to this forum post and to the conclusion that Catdoc is way too much effort and there must be something easier to use, or at least to compile.
Turns out there is: Antiword. Antiword is a lifesaver. Just download the Linux source, extract it to a directory, type "make all" then "make install" and, lo and behold, you can extract the text from Word files from the command line. The only problem I had with Antiword is that it wants to put the antiword binary into "~/bin/" and its other files into "~/.antiword". That is alright, but I wanted to contain everything in one directory (because I'm crazy like that).
Spending some time with grep, modifying the source, and recompiling Antiword took less time than I had spent trying to get Catdoc just to compile. Basically, you just need to tell Antiword to look for its .antiword directory in whatever directory antiword happens to be located in. I did this by adding the following code in options.c around line 220:
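If you would rather call Antiword from a program than from a shell, here is a minimal sketch in C, assuming a POSIX system and that the antiword binary is on your PATH (a "~/bin/" install may not be). The helper name capture_output and the file name report.doc are made up for illustration; the only thing it relies on from Antiword is that "antiword somefile.doc" prints the extracted text to standard output.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: run `prog arg` without going through a shell and
 * copy up to buflen-1 bytes of its standard output into buf.
 * Returns the number of bytes captured, or -1 on error or non-zero exit.
 */
static long capture_output(const char *prog, const char *arg,
                           char *buf, size_t buflen)
{
    int fds[2];
    if (buflen == 0 || pipe(fds) != 0) {
        return -1;
    }

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        /* Child: send stdout into the pipe, then become the program. */
        close(fds[0]);
        dup2(fds[1], STDOUT_FILENO);
        close(fds[1]);
        execlp(prog, prog, arg, (char *)NULL);
        _exit(127); /* exec failed, e.g. program not found on PATH */
    }

    /* Parent: read until the child closes its end of the pipe. */
    close(fds[1]);
    size_t total = 0;
    char discard[4096];
    for (;;) {
        ssize_t n;
        if (total < buflen - 1) {
            n = read(fds[0], buf + total, buflen - 1 - total);
            if (n > 0) {
                total += (size_t)n;
            }
        } else {
            /* Buffer is full: drain the rest so the child can exit. */
            n = read(fds[0], discard, sizeof discard);
        }
        if (n <= 0) {
            break;
        }
    }
    close(fds[0]);
    buf[total] = '\0';

    int status = 0;
    if (waitpid(pid, &status, 0) < 0) {
        return -1;
    }
    if (!WIFEXITED(status) || WEXITSTATUS(status) != 0) {
        return -1;
    }
    return (long)total;
}
```

Calling capture_output("antiword", "report.doc", text, sizeof text) would then leave the document's text in text, truncated if the buffer is too small. Skipping the shell means odd characters in the file name need no quoting.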
    /* Try in same directory version of the mapping file */
    if (tFilenameLen <
            sizeof(szMappingFile) -
            sizeof(ANTIWORD_DIR) -
            2 * sizeof(FILE_SEPARATOR)) {
        sprintf(szMappingFile,
            ANTIWORD_DIR FILE_SEPARATOR "%s%s",
            szLeafname, szSuffix);
        DBG_MSG(szMappingFile);
        pFile = fopen(szMappingFile, "r");
        if (pFile != NULL) {
            return pFile;
        }
    } else {
        werr(0, "same directory mappingfilename too long, ignored");
    }

Seems to work just fine for me, but your results might vary. I've included my compiled binary for Linux with this post. It might work in your environment, it might not.
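One caveat worth keeping in mind: if ANTIWORD_DIR expands to a relative path (I believe it is ".antiword", but check your copy of the source), the patch above resolves it against the current working directory, so it only finds the mapping files when you run antiword from the directory it lives in. Below is a rough sketch of how you could build the path from the location of the running binary instead. It assumes Linux, where /proc/self/exe is a symlink to the executable. The function name mapping_path_next_to_exe is made up, and "8859-1.txt" is only an example mapping file name. It is not part of Antiword.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <libgen.h>     /* dirname */
#include <stdio.h>      /* snprintf */
#include <sys/types.h>  /* ssize_t */
#include <unistd.h>     /* readlink */

/*
 * Hypothetical helper: write
 *   "<directory of the running binary>/.antiword/<leafname>"
 * into out. Returns 0 on success, -1 on failure or if it does not fit.
 * Assumption: Linux, where /proc/self/exe points at the executable.
 */
static int mapping_path_next_to_exe(const char *leafname,
                                    char *out, size_t outlen)
{
    char exe[4096];
    ssize_t n = readlink("/proc/self/exe", exe, sizeof(exe) - 1);
    if (n < 0) {
        return -1;
    }
    exe[n] = '\0'; /* readlink does not NUL-terminate the result */

    /* dirname may modify its argument; exe is our own local copy. */
    const char *dir = dirname(exe);

    /* snprintf reports the length it wanted, so truncation is detectable. */
    int written = snprintf(out, outlen, "%s/.antiword/%s", dir, leafname);
    if (written < 0 || (size_t)written >= outlen) {
        return -1;
    }
    return 0;
}
```

The resulting path could then be handed to fopen in the same way the patch does with szMappingFile. Using snprintf with an explicit size also replaces the manual length arithmetic that guards the sprintf call.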
I also heard good things about wv, but read that it requires root access (I didn't investigate this claim though), which doesn't help me since I can't be root. I also found Word2x, though I know nothing about it.
I wrangle code for Undrip and sling words for StartupGrind. Previously, I was Co-Founder and CTO of Plancast.