The Man Who Sold the Web Blog


Live Case Study: The case of 300,000 pages and counting.

11 Apr

This is my first live case study.  In this case study, I will build and grow an autoscale, autopilot, value-add site from scratch.  The purpose of this case study is to demonstrate techniques in real time.  With the exception of coding the initial site (these activities are tabulated under day 0), everything is done in real time, including domain registration.

The subject of this case study will be a niche jobs search site, built off the Indeed API.  Indeed.com is an established jobs search engine of US-based job opportunities.  They have a publishers program with an API that allows our site to pull the job results.  For job seekers that click through to Indeed, we will also get paid as a publisher.  A nice added bonus, eh?

Our niche job search site will focus on clerical jobs.  Within clerical jobs, we have 3 sections: 1 for accounting jobs, 1 for bookkeeping jobs, and 1 for auditing jobs.

Now, how does the autoscale work?

First, upon launch, the site will have 300,000 pages.  Note, this does not mean all 300K pages will be indexed by Google.  It only means a Googlebot will find 300K unique pages across our site.  Here’s how I came up with that estimate.

The US has about 40,000 active zip codes.  For each zip code, our site will have 3 unique pages–1 for accounting jobs in that zip code, 1 for bookkeeping jobs, and 1 for auditing jobs.  That’s 40,000 * 3 = 120,000 pages right there.  Within each of those pages, we will list 15 job listings that are provided through Indeed’s API.  (I can list more jobs if I wanted to, but I thought 15 was a good number.)  Each of these listings will have a page on our site.  Let’s assume that only 10% of the job listings are unique, which is a conservative estimate.  This adds another 120,000 * 15 * 0.10 = 180,000 pages.  There we go.  In total, we have 120,000 + 180,000 = 300,000 pages!
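For anyone who wants to play with the assumptions, here is a minimal sketch of the same estimate in Python.  The constants are the figures quoted above; the function name and its parameters are just my own illustration, not part of the site's code.

```python
# Back-of-the-envelope page count, using only the figures quoted above.
ZIP_CODES = 40_000        # approximate number of active US zip codes
CATEGORIES = 3            # accounting, bookkeeping, auditing
LISTINGS_PER_PAGE = 15    # job listings shown on each zip page
UNIQUE_SHARE = 0.10       # assumed fraction of listings that are unique


def estimate_pages(zip_codes=ZIP_CODES, categories=CATEGORIES,
                   listings_per_page=LISTINGS_PER_PAGE,
                   unique_share=UNIQUE_SHARE):
    """Return (zip_pages, listing_pages, total_pages)."""
    zip_pages = zip_codes * categories
    listing_pages = round(zip_pages * listings_per_page * unique_share)
    return zip_pages, listing_pages, zip_pages + listing_pages


if __name__ == "__main__":
    zip_pages, listing_pages, total = estimate_pages()
    print(zip_pages, listing_pages, total)
```

Changing `unique_share` shows how sensitive the total is to the 10% uniqueness guess, which is the least certain input here.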

That’s just the starting point.  With time, more clerical jobs will be posted on Indeed, which translates into more listing pages on our site.  Therefore, the 300,000 will continue to grow–and at a good pace, since clerical jobs represent the largest occupation in the US!

Now, let’s put down some goals for this site that we can measure against.  I’ve listed some below.

  • Create a massive, autoscaling site via an API. This is the subject of my recent blog article.  As you can see, the entire site is architected based on this principle.  Our site is already 300,000 pages deep, so I consider this goal accomplished.  We’re starting off on a positive note! 😀
  • Get the site indexed on the day of domain registration. We will do this using the Rapid Google Indexer, which is still free, guys!
  • Get over 1,000 pages indexed by Google within 1 week! I will do this using an SEO strategy that takes into account all 6 levers to SEO.
  • Demonstrate the power of long tail search traffic. No quantification of this goal yet.
  • More goals to come (possibly)… You’re welcome to leave suggestions in the comments.

Instead of adding more goals, I’ve decided to just track the key milestones achieved so far:

  • Site indexed by Google in 1 day. See Day 2 update.
  • 1,000 pages indexed in its first week. See Day 6 update.
  • 21,000+ pages indexed in less than 21 days. See Day 18 update and check out 21K in 21 Days.
  • Site profitable in its first month. See Day 28 update.
  • Site making $100+/month by month 2. See Days 49-55 update.

Now, how much will all this cost?  Hey, I’m a cheap guy, so the only money I’m willing to invest right now is the cost of the domain, $7.50.  Everything else will not cost me anything extra, because I will be leveraging my existing web hosting account, tools, and other resources, all of which have already been paid for.  If you are trying to replicate this experiment, your costs will be the domain name ($7.50 with a GoDaddy coupon) and the price of hosting ($2.95/month for unlimited space and unlimited bandwidth with this promo link).

You may have noticed the initial goals do not focus on monetization.  That’s correct.  In the initial phase of this case study, I will only focus on SEO and Google indexing.  The second phase will focus on monetization strategy.

Once again, this is a live case study.  The beauty of that is there’s no smoke and mirrors.  What you see is real.  You can even track the progress yourself, since I will be divulging the domain name once registered.  The risk I am taking is that this case study may fail.  Techniques that have worked for me in the past may, for whatever reason (maybe a Google algorithm tweak?), not work this time around.  I will have to react and adapt on the spot.  Will I be able to deliver under pressure?  We’ll find out. 😀 Either way, I hope this will be a great learning experience for both of us.

So, sit back, relax, bookmark this page, make yourself a cup of tea, and check back often.  I will be updating this article directly with new updates.  Any timestamps will be in PST.  (I live in sunny Southern California, after all.)

And so it begins…

 

Day 0 Activities

It took me about 5-6 hours to create the code for this site.  That time includes 2 meal breaks.  But, keep in mind, I’ve been doing this kinda stuff, on and off, for over a decade, so I’ve become very efficient.

Even though the site has 300K pages, it only has 8 files.  See the screenshot below to understand the site architecture.

Here is a breakdown of the files:

  • .htaccess – This creates the 120K SEO-friendly zip code URLs.
  • header.php, footer.php – Just a standard header and footer that I’m including on all the pages.
  • index.php – The homepage.
  • zip.php – This is the zip-based search results page.  In other words, this file is responsible for 120K pages (40K for accounting, 40K for bookkeeping, 40K for auditing).
  • listing.php – This is the individual job listing page.  In other words, this file is responsible for 180K (and growing) pages.
  • browse.php – This page lists all of the US states and territories.  It’s kinda like a sitemap.  This is very important for Googlebot to come across.
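To make the URL idea concrete, here is a rough sketch in Python of what a rewrite layer like this does: it takes a friendly path and hands the category and zip code to the script that renders the page.  The URL pattern (`/accounting-jobs/90210`) and the `route` function are assumptions for illustration only; the post does not show the site's actual URL scheme or rewrite rules.

```python
import re

# Hypothetical URL pattern; the real site's scheme is not shown in this post.
CATEGORIES = ("accounting", "bookkeeping", "auditing")
ZIP_URL = re.compile(r"^/(?P<category>[a-z]+)-jobs/(?P<zip>\d{5})/?$")


def route(path):
    """Map a friendly path to the script and query parameters that serve it.

    Returns None when the path is not a zip-code results page.
    """
    match = ZIP_URL.match(path)
    if match is None or match.group("category") not in CATEGORIES:
        return None
    return "zip.php", {"category": match.group("category"),
                       "zip": match.group("zip")}
```

With one pattern and three categories, a single script can answer for every category and zip code combination, which is how a handful of files can stand behind 120K URLs.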

This site is based on PHP and MySQL.  I have 1 MySQL database, which stores my 40,000 US zip codes.  Almost all the content is generated automatically through Indeed’s Publisher API.  The below diagram illustrates how the Indeed data feed and database is being integrated.
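Here is a simplified sketch, in Python rather than the site's PHP, of how a zip code lookup and a job feed might be combined into one results page.  Everything named here is an assumption for illustration: the `zips` table layout, the `build_zip_page` function, and the `fetch_jobs` callable, which stands in for a call to the job API.  SQLite is used in place of MySQL so the example runs on its own, and the feed is a placeholder that returns made-up listings.

```python
import sqlite3


def open_zip_db(rows):
    """Create an in-memory zip table (a stand-in for the MySQL database)."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE zips (zip TEXT PRIMARY KEY, city TEXT, state TEXT)")
    db.executemany("INSERT INTO zips VALUES (?, ?, ?)", rows)
    return db


def build_zip_page(db, fetch_jobs, category, zip_code, limit=15):
    """Combine a zip lookup with a job feed to produce the data for one page.

    fetch_jobs is a hypothetical callable (query, location, limit) -> list of
    dicts; on a real site it would wrap the job API request.
    """
    row = db.execute("SELECT city, state FROM zips WHERE zip = ?",
                     (zip_code,)).fetchone()
    if row is None:
        return None  # unknown zip code: no page to render
    city, state = row
    jobs = fetch_jobs(f"{category} jobs", zip_code, limit)
    return {"title": f"{category.title()} jobs in {city}, {state} {zip_code}",
            "jobs": jobs[:limit]}


def fake_feed(query, location, limit):
    """Placeholder feed that returns made-up listings for demonstration."""
    return [{"jobtitle": f"{query} #{i}", "location": location}
            for i in range(1, limit + 1)]
```

For example, `build_zip_page(open_zip_db([("90401", "Santa Monica", "CA")]), fake_feed, "accounting", "90401")` returns a title and 15 placeholder listings.  The database only supplies the location text; all of the job content comes from the feed.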

 

4/11 – Day 1 Activities

12:25AM

Okay, I just registered the domain name at GoDaddy for $7.67 (i.e. $7.49 + fees) using the coupon code FAN3.

>> Unveiled: LocalClericalJobs.com.

Here is a screenshot of the purchase, which shows the timestamp.

Even though this case study is not meant to illustrate monetization techniques (I’ll save that for a future case study!), I decided to slap on a CPA offer.  This is an offer about data entry jobs (very relevant to my targeted visitors!) that I found off MaxBounty.  Here is a direct link to the offer for your perusal.  I have close to zero expectations for this CPA, but it will at least pay for the cost of the domain.  Also, see how I designed the site so that it could be monetized with ads.  The entire right-most column is completely open real estate right now.

1:18AM

I just ran the Rapid Google Indexer for the domain using the keyword “local clerical jobs.”  I also installed Google Analytics.  After some traffic starts to roll in, I will share screenshots from Google Analytics to illustrate long tail SEO.

I won’t be doing any SEO until the site is indexed!  This is to illustrate that using the Google Rapid Indexer alone will get your domain indexed.

Okay, time for bed.

 

4/12 – Day 2 Activities

Just checked and saw the site’s been indexed.  There are only 2 pages indexed, but it’s barely been a day.  See screenshot.

Today and over the next few days, I will focus on SEO.  Do you know the 6 levers to SEO?  I will work on an SEO strategy that spans all 6 levers.  It’s a strategy that I developed in the past few months, which I have been using quite a bit as of late.  I call it the SEO Matrix.

It involves creating links from a diverse number of site types, including:

For depth, I will create 4 tiers of links, with the lower quality forum links at the 4th tier.  There will be random interlinking across tiers and among links of the same tier–hence the name SEO Matrix.

I will create 4 SEO Matrices, one for the keyword “clerical jobs in Santa Monica,” one for “accounting jobs in Pasadena,” one for “bookkeeping jobs in Glendale,” and one for “auditing jobs in Malibu.”  These are all cities in and around Los Angeles, somewhat arbitrarily chosen.  If you check Google Adwords Keyword Tool, you will notice these are long tail keywords–meaning, there are 0 forecasted monthly searches for these terms.  Call me crazy for spending any effort working on these terms, but I’m doing so to demonstrate the power of long tail searches.

Doing this work may take up to a week.  I will post an update when it’s fully complete.

I have some niche business directories that also span 40,000 zip codes each.  From each respective zip code, I will be linking to this new site.  This is an easy thing to do, so I will complete it today.

9:51PM

Just checked Google and saw we have 31 pages indexed now.  Still very short of our 1,000+ pages indexed goal by Day 7!

4/13 – Day 3 Update

  • 92 pages indexed by Google

4/14 – Day 4 Update

  • 129 pages indexed by Google

4/15 – Day 5 Update

  • 361 pages indexed by Google

4/16 – Day 6 Update

  • 3,340 pages indexed by Google (revised from an earlier reading of 2,910). Third goal accomplished! At this point, Google should start sending over long tail searches to LocalClericalJobs.com.  I will begin sharing Google Analytics screenshots in the near future.
  • Only 2 of the 4 SEO Matrices completed so far.
  • I may consider more aggressively monetizing the site.  Haven’t decided yet, but will keep you updated.

4/17 – Day 7 Update

  • 3,960 pages indexed by Google
  • 12 visits from unique visitors, 7 of which came from Google.

4/18 – Day 8 Update

  • I’ve noticed the number of pages indexed by Google is varying between 2,600 and 3,960 pages.  Right now, it’s at 3,960.
  • I will be converting the job listing pages to SEO-friendly URLs, which should help increase the number of pages indexed.  This is an easy change that I will do today.
  • 30 unique visitors, 13 of which came from Google.

4/19 – Day 9 Update

  • 6,510 pages indexed by Google.  Another significant jump today after some dancing yesterday.
  • All 4 SEO Matrices are now complete!
  • 15 unique visitors, 11 of which came from Google.

4/20 – Day 10 Update

  • 8,080 pages indexed by Google.  No signs of any further plateauing! EDIT: Seems to fluctuate between 8,080 and 3,790.
  • 22 unique visitors, 11 of which came from Google.

4/21 – Day 11 Update

  • 10,100 pages indexed by Google (revised from an earlier reading of 10,300).  We broke into the 5-digit mark. 😀  It’s only been a week and a half and we already have a site that can provide us with over 10K indexed backlinks (in addition to 290K unindexed backlinks)!
  • 20 unique visitors, 14 of which came from Google.

4/22 – Day 12 Update

  • 11,000 pages indexed by Google.
  • 15 unique visitors, 14 of which came from Google.  Tomorrow, I will post a Google Analytics snapshot of long tail keywords used to find LocalClericalJobs for the full week of 4/17.

4/23 – Day 13 Update

  • 8 unique visitors, 6 of which came from Google.  The decline in search traffic is presumably tied to a decrease in job searching activities done over Easter weekend.
  • Here are all the search terms used for the week of 4/17.  Click here to view screenshot. Note a few things:
    • 1) Search traffic is coming entirely from long tail search terms.
    • 2) These long tail search terms are not for the 4 specific terms related to my 4 SEO Matrices.  This is why when I created them initially, I stated that they were “arbitrarily chosen.”  The purpose of the initial, one-time SEO effort is merely to get the site indexed deeper.
    • 3) I did some spot checking and noticed for all the terms, we were ranked on page 1 of Google.  This shows that long tail traffic is incredibly easy to rank for, because no one is targeting those terms.  However, you will get few searches a month, so the strategy is to go for bulk.  That is why this SEO strategy works so perfectly for a mega site like ours, with 300K+ pages upon launch and growing.

4/24 – Day 14 Update

  • 17,000 pages indexed by Google (revised from an earlier reading of 15,300).
  • Traffic stats to be updated weekly from now on.

4/25 – Day 15 Update

  • 16,000-17,000 pages indexed by Google.

4/26 – Day 16 Update

  • 19,900 pages indexed by Google (revised from an earlier reading of 18,600).

4/27 – Day 17 Update

  • 19,800 pages indexed by Google.

4/28 – Day 18 Update

  • 21,300 pages indexed by Google.

4/29 – Day 19 Update

  • 21,300 pages indexed by Google.  No change today.

4/30 – Day 20 Update

  • 22,100 pages indexed by Google.  Starting tomorrow, I will begin to more aggressively monetize the site.  I will also be posting this past week’s Google Analytics screenshots tomorrow.

5/1 – Day 21 Update

5/2 – Day 22 Update

  • 23,000 pages indexed by Google.

5/3 – Day 23 Update

  • 23,900 pages indexed by Google.

5/4 – Day 24 Update

  • 24,600 pages indexed by Google.
  • Just got approved for Adsense and added an Adsense 250×250 unit to the results page.  Doesn’t seem like the ads are loading yet.  It’s been a while since I’ve used Adsense, so I assume there is a lag before the ads begin to be served.

5/5 – Day 25 Update

  • 24,600 pages indexed by Google.

5/6 – Day 26 Update

  • 22,400 pages indexed by Google.  Ah, a noticeable drop today!

5/7 – Day 27 Update

  • 22,500 pages indexed by Google.
  • A couple of days ago, I analyzed all my data from the first 21 days of this case study.  I compiled this into a book, 21K in 21 Days, that you can freely download.  Check it out, because I extracted some key insights that counter traditional “Internet Marketing” beliefs.  Also, I included additional screenshots from Google Analytics and other data points.

5/8 – Day 28 Update

  • 22,400 pages indexed by Google.
  • Here is the Google Analytics update for the week of 5/1:
  • Here is the monetization update for the week of 5/1:
    • Google Adsense: $0.76  (I added a couple more Google Adsense banners to the site today.)
    • Indeed: $6.61
    • MaxBounty: $0  (If MaxBounty campaign doesn’t seem to convert, I may replace it with a ClickBank offer or remove it altogether.)
    • Note that in our first week of monetization, we’ve essentially recovered our cost of investment (i.e. the $7.50 domain)!  If I include the Indeed earnings prior to this week, then I have already fully recovered my cost of investment and am profitable.
  • Going forward, I will be updating stats once a week (on Sundays).

5/12 – Day 32 Update

  • So, a reader was nice enough to direct me to this thread on WarriorForum.  I found it quite interesting, because it describes how someone else also created a site solely based on Indeed.com’s Publisher program.  The key difference is that his site is just a landing page that will redirect traffic to Indeed.com.  Therefore, its monetization strategy is solely based on Indeed’s Publisher program; also, its marketing efforts are solely for the homepage (since that’s the only page the site has).  He is trying offline techniques to market the site–e.g. distributing business cards, posting flyers related to finding jobs.  With the case of LocalClericalJobs, we have built a mega site, which has allowed us to monetize through traditional advertising and market via long tail SEO.  Long tail SEO is a much more passive way of marketing and runs on autopilot, whereas going out to reach offline consumers is clearly more proactive and requires constant work for results.
  • From the thread, I also learned that I can sell job postings via Indeed’s Publisher program.  You set the price you wish to charge for a job posting.  Indeed charges a $20 processing/administration fee, so your profit is anything above $20.  I just integrated job postings into LocalClericalJobs and priced them at $149.  Therefore, for each job posting made through our site, we’ll net $149 - $20 = $129.  FYI, Indeed claims the average job listing price is $200 within its network.  Let’s see how this works out!

5/15 – Week of 5/8  (Days 28-34) Update

  • SEO update:
    • 21,900 pages indexed by Google.  (The number of pages has been fluctuating and declining a bit.)
  • Google Analytics update:
    • Traffic overview: 150 visits, 391 pageviews (78% US)
    • Long tail keywords: 84 keywords
    • If there are other stats you would like me to pull from Google Analytics, please let me know.
  • Monetization update:
    • Google Adsense: $6.56
    • Indeed: $1.84
    • MaxBounty: $0
    • Total: $8.40

5/22 – Week of 5/15  (Days 35-41) Update

  • SEO update:
    • 22,800 pages indexed by Google.
  • Google Analytics update:
    • Traffic overview: 194 visits, 480 pageviews (72% US)
    • Long tail keywords: 79 keywords
  • Monetization update:
    • Google Adsense: $0.41
    • Indeed: $2.82
    • MaxBounty: $0
    • Total: $3.23
    • Ah, a pretty low performing week in terms of monetization.  This upcoming week, I will remove the MaxBounty offer and begin to monetize more aggressively by selling text ads directly to advertisers.
  • Here’s a big announcement!  Earlier this week, I released the turnkey version of this case study, 300K Page Job Search In-a-Box.  Click this link for a 30% discount, exclusive to my blog readers.  This coupon will expire after a fixed number of uses.

5/29 – Week of 5/22  (Days 42-48) Update

  • SEO update:
    • 47,500 pages indexed by Google.  Quite a leap!  I’m curious now as to how long it will take to break into the 6 digits.
  • Google Analytics update:
    • Traffic overview: 227 visits, 629 pageviews (63% US)
    • Long tail keywords: 111 keywords
  • Monetization update:
    • Google Adsense: $0.70
    • Indeed: $0.39
      • So the Indeed revenue seems to fluctuate quite a bit and can be highly dependent on the niche.  I have one customer who bought the 300K Job Search In-a-Box (25% discount code embedded) last week and has seen over $20 from Indeed in a single day–in fact, literally his second day of operation!  With permission, I have included a screenshot of his earnings for the first 4 days (totaling $37.78): 300K Job Search In-a-Box customer Indeed earnings.
    • MaxBounty: $2.70.  I will keep the MaxBounty offer up for now, since it generated some revenue this week.
    • Direct Ad Sales: $8.00
    • Total: $11.79
    • I ended up going after direct ad sales only yesterday, when I posted 2 advertisements on webmaster forums (one on WarriorForum and one on DigitalPoint) and sent an email to my newsletter.  I do have $100+ in pending ad sales from people who have expressed interest.  I expect most of that $100+ to hit in next week’s update.

6/1 – Day 52 Update

  • The blog discount of 40% off 300K Job Search In-a-Box will only last through this Sunday, 6/5! EDIT 6/6: Your exclusive blog reader discount is now 30%. EDIT 6/20: Your exclusive blog reader discount is now 25%.
  • Here’s a teaser for Sunday’s update… I’ve already made $100+ this week from direct ad sales and we already have 52,600 pages indexed by Google!

6/5 – Week of 5/29 (Days 49-55) Update

  • SEO update:
    • 56,900 pages indexed by Google.
  • Google Analytics update:
    • Traffic overview: 394 visits, 1,045 pageviews (70% US)
    • Long tail keywords: 240 keywords (note this is over a 100% increase from last week)
  • Monetization update:
    • Google Adsense: $1.88
    • Indeed: $6.20
    • MaxBounty: $0
    • Direct Ad Sales: $116 — this includes an annual payment
    • Total: $124.08

6/12 – Week of 6/5 (Days 56-62) Update

  • SEO update:
    • 58,900 pages indexed by Google.
  • Google Analytics update:
    • Traffic overview: 168 visits, 470 pageviews (72% US)
    • Long tail keywords: 92 keywords
  • Monetization update:
    • Google Adsense: $4.21
    • Indeed: $3.82
    • MaxBounty: $1.35
    • Total: $9.38
    • I didn’t make any direct ad sales this week.  Next week, I will try to get more direct ad sales.  Once I get to the point where I will average $100+/month, I will only update this case study on a monthly basis.
  • In related news, I’ve been receiving great responses from customers of 300K Job Search In-a-Box.  Check out this customer testimonial.

6/13 – Day 64 Update

  • Man, already into month 3 of the case study.  Time flies!  Anyway…
  • Today, I released the Hosted Solution for 300K Job Search In-a-Box.  With the Hosted Solution, I’ve saved you all the technical headache of script installation, configuration, and maintenance.  You don’t even need to deal with hosting fees anymore!  All you need to do is register the domain name and then point it to my cloud.  Click here for the Early Bird 25% discount.  The Early Bird discount will expire after a certain number of uses and the discount is for life!
  • In other news, I posted an ad to sell the contextual search results ads on the SitePoint Marketplace.  I’m listing this initially for $11/month.  See my listing here.

6/19 – Week of 6/12 (Days 63-69) Update

  • SEO update:
    • 58,800 pages indexed by Google.
  • Google Analytics update:
    • Traffic overview: 105 visits, 266 pageviews (52% US)
    • Long tail keywords: 16 keywords
  • Monetization update:
    • Google Adsense: $2.07
    • Indeed: $3.97
    • MaxBounty: $0
    • Direct Ad Sales: $9.00
    • Total: $15.04
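If you want to see how the weekly numbers add up, here is a small Python sketch that keeps a running total of the weekly "Total" figures reported in the updates above.  It leaves out the week of 5/1, since that update did not state a total, and the `running_total` helper is just my own illustration.

```python
from decimal import Decimal

# Weekly totals as reported in the monetization updates (USD),
# keyed by the "Week of" date in each update heading.
WEEKLY_TOTALS = {
    "5/8": "8.40",
    "5/15": "3.23",
    "5/22": "11.79",
    "5/29": "124.08",
    "6/5": "9.38",
    "6/12": "15.04",
}


def running_total(weekly):
    """Return the cumulative revenue after each week, in reporting order."""
    total = Decimal("0")
    cumulative = {}
    for week, amount in weekly.items():
        total += Decimal(amount)
        cumulative[week] = total
    return cumulative


if __name__ == "__main__":
    for week, total in running_total(WEEKLY_TOTALS).items():
        print(f"Week of {week}: ${total} to date")
```

Decimal is used instead of floats so that the cents add up exactly.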

7/2 – Conclusion of the Case of 300,000 Pages and Counting

I’ve decided to end this case study before it gets way too boring for folks, and to end things on a high note.  Revisiting the milestones, I would consider this case study to be a successful one.

Key milestones achieved:

  • Site indexed by Google in 1 day. See Day 2 update.
  • 1,000 pages indexed in its first week. See Day 6 update.
  • 21,000+ pages indexed in less than 21 days. See Day 18 update and check out 21K in 21 Days.
  • Site profitable in its first month. See Day 28 update.
  • Site making $100+/month by month 2. See Days 49-55 update.

Looking back, here is the approach we took for the site.  First, we leveraged an API as a source of autoscaling content.  Next, pairing that with a US zip code database, we created a megasite of 300,000+ pages of unique content.  Upon the site’s launch, we focused our marketing efforts purely on SEO with the sole purpose of getting as many pages indexed by Google as quickly as possible.  As more pages were indexed, we started receiving more long tail search traffic, driving up both Indeed and Adsense earnings.  At a certain point, we began to monetize more aggressively by going after direct ad sales.  Our value proposition to the advertisers was selling them 50,000+ indexed and contextual backlinks.  Indexed and contextual backlinks provide a great source of SEO linkjuice.  I only sold subscription-based ads (e.g. $x/month), so that monetization would be on autopilot.

If you’re looking to replicate this case study for yourself, check out 300K Page Job Search In-a-Box.  With the Hosted Solution, I take care of all the technical stuff.  All you need to do is register the domain, point it to my cloud, then do some SEO at your own pace.  You’ll have your own megasite up and running in an hour.

Thank you for reading this case study!  I hope to start another one soon!

dave
