iBegin Blog

08 Nov, 2007

An open call - How to help non-commercial users?

Posted by: Ahmed Farooq In: Other

I’d really appreciate thoughts and comments on an important issue we’ve been thinking over.

For those that have been around for a while - you know our tendencies are slightly off kilter. How else do you explain an 8 point philosophy?

I came to Canada when I was 14. Starting off with $0 (I had no allowance), I’ve grown the parent company (Enthropia Inc) into a multi-faceted organization with over 25 employees and 6 divisions.

Due to my zero-sum beginnings (and being a naive immigrant integrating into high school is no fun either), I’ve always had a soft spot for the upstart. Someone that has the guts and passion to do something, but needs some support. I even created a scholarship at the University of Toronto that specifically rewarded students who had a job and had an average just below Honors (80%). The money would allow them to work a little bit less, study a bit more, and get onto the Dean’s List. I did this while a student myself (third year engineering). I’m not bragging here - I just want to underscore my commitment to the little guy.

So when we originally had the idea of iBegin Source and we began discussions with data providers and partners, I was adamant that we have a non-commercial option. While $1000 may not look like a large sum to some people, to others it is too much to spare. If there is one area of the internet that needs innovation - it’s local. So we wanted to help developers create something unique.

At the same time - there was no physical medium to track. So as a trade-off, the non-commercial listings had the phone numbers removed (to thwart telemarketers) and also geocoding removed (a separate service of iBegin - consider it a value-add).

We’ve had some interesting apps developed. We’ve supported projects we thought were interesting with the full data set. Overall I think we’ve done a fair bit to change the landscape - but it isn’t enough.

Two issues have arisen:

1. The lack of phone # and geocoding turns off people we want on our side. They understand our situation, but not having phone #s and/or geocoding limits the ‘usefulness’ of what they can build.

2. Some people have an odd sense of entitlement. Bulk mail companies have taken our data and used it for commercial purposes. My favorite was when a company emailed us thanking us for our data, and offering recommendations on how we could make it easier for them to send out mass-mails.

Before going further - I want to clarify. iBegin Source itself is a success. We even get fan-mail. The product itself is doing great - we did just launch Canadian data.

And while US data is (relatively) common, Canadian data is much harder to come across. It is much harder to get financing for Canadian-focused companies. This post is about fostering innovation in the local space (with our data as a critical component). We want to give users access to the full enchilada (phone numbers and geocoding). At the same time, we want to make sure businesses don’t see it as some freebie and use the data commercially without paying.

So the question is - “What should we do to make sure people with a great local idea have the data they need?”

A few thoughts and ideas we’ve kicked around (I’m keeping my personal opinion out of them):

  • Require the end-user to submit a little description (50-250 words) on what the data will be used for.
  • Add a small surcharge ($5-$25). This way we also have billing info, which removes anonymity.
  • Make non-commercial physical only. For a shipping & handling fee, we will FedEx them a DVD with the data.
  • Create an API.

I look forward to your thoughts.

UPDATE: Great ideas already under way. I want to emphasize - the non-commercial requirements still apply. This is more about getting the data to people who can do interesting things with it. We are very cognizant of the fine balance required between commercial and non-commercial users (if there were no commercial users iBegin Source wouldn’t work).

24 Responses to "An open call - How to help non-commercial users?"

1 | Wojjie

November 8th, 2007 at 8:17 pm

Take the Google maps approach and offer an API with a limit on the number of queries per day/month.

Also provide the users with keys to use the API based on their email and server’s ip address, and ban any ip/key that is not adhering to your AUP.
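As a rough illustration of the key-plus-quota approach suggested here, the sketch below counts requests per API key and IP address per day and supports banning. Every name in it (`QuotaTracker`, `allow`, `ban`, the limit of 2) is hypothetical and chosen only for the example; it is not iBegin's actual system.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class QuotaTracker:
    """Hypothetical per-key, per-IP daily request quota with a ban list."""
    daily_limit: int
    banned: set = field(default_factory=set)
    _counts: dict = field(default_factory=dict)

    def allow(self, api_key: str, ip: str, today: date) -> bool:
        """Return True and count the request if it is within quota."""
        if (api_key, ip) in self.banned:
            return False
        bucket = (api_key, ip, today)  # counters reset naturally each day
        used = self._counts.get(bucket, 0)
        if used >= self.daily_limit:
            return False
        self._counts[bucket] = used + 1
        return True

    def ban(self, api_key: str, ip: str) -> None:
        """Block a key/IP pair, e.g. after an acceptable-use violation."""
        self.banned.add((api_key, ip))


tracker = QuotaTracker(daily_limit=2)
day = date(2007, 11, 8)
print(tracker.allow("demo-key", "203.0.113.5", day))  # first request of the day
```

An in-memory dictionary like this would not survive a restart or scale across servers; a real deployment would presumably keep the counters in a shared store.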

2 | Michael

November 8th, 2007 at 8:18 pm

I think manually approving the projects is the way to go. You could offer the full package for non-commercial users, and keep marketing companies out of the code altogether.

It may be time consuming, but it would be very hard for marketers to get it that way.

It might be interesting to have some sort of project database so we could see what kinds of projects were being created with the source material. I’d be very interested to see what people were doing with it, both non-profit and commercial projects.

3 | Krispy

November 8th, 2007 at 8:20 pm

A description of what they want to do is one thing, but it’s far from a way to vet anyone who would use the data for nefarious purposes. I think it may be prudent to socialize it a bit - sort of a who’s who, 3rd party style. Can someone else vouch for your intentions? That brings up the issue of “well, I don’t know anyone?” - I think if people have to put their reputations up for collateral, that may deter some from using the data in a bad way. Also, maybe just have the API - at least with that you can verify where the data is being pulled from. In the end there really is no sure way to prevent “bad” usage of the data.

I think a combination of a description and api makes the most sense.

Anyway, that’s my 2 cents. (And with the Canadian dollar trading above $1.10 that’s like 2.2 cents!)

4 | Krispy

November 8th, 2007 at 8:23 pm

One more thought:

Give full sample data on a specific region, then ask them to submit a request demonstrating the app they have built; once demonstrated, they can have access to the full set. I mean, you don’t need all the data in order to develop an application. You only need it all once you deploy it. Right?

5 | David Cramer

November 8th, 2007 at 8:30 pm

You should offer blog data too!

6 | Mike Bowden

November 8th, 2007 at 8:31 pm

Ahmed,

First of all I’d like to applaud your efforts in making more data and resources available to the little guys. I’m not sure if our little chat the other day had anything to do with it, but I think it’s a great idea with the way you’re steering the company and the more resources that are offered to the startups or incubators out there, the better off the net and local communities will become.

I’ve had plenty of ideas for the data you provided me for my local community, but haven’t had time to sit down and hash everything out fully. But being able to come to one of the “big boys” and get the proper information and know 100% that it’s accurate and legit, means a lot and will make any startup water at the mouth.

As far as how you should handle the releasing of the data, each idea you laid out should be used fully. Not one or the other, but all of them. Not only will this allow you to keep tabs on everything that’s going on, it will give you a larger insight on the projects being developed. Which in turn could produce potential funding from Enthropia, INC. or even other Angel Investors.

I wouldn’t think a full API system would be the best route on that front, but an API system would help those users that would like updated data and a way to access it without totally cleaning and redoing their databases. It also allows you as a provider to keep tabs on who is accessing the information and for what reason. As far as sending data via DVD, I think that’s a great option to offer but one that shouldn’t be mandatory. Simply because some companies may want to have this information on a backup somewhere, or a hard copy, but also may not want to wait on the new data to arrive when something is updated or added and with local data, new information is added, corrected and deleted on a daily basis.

All in all I think it’s a great idea and I think that a combination of all of the systems suggested should be implemented, but only a select few mandatory. Great job my friend and please, keep up the good work.

Mike Bowden

7 | Ahmed Farooq

November 8th, 2007 at 8:37 pm

Whew - thanks guys, a lot to mull over.

There are two main reasons I don’t like the API idea:

1. Because that isn’t how the commercial data works. We want the move from non-commercial to commercial to be relatively seamless. Going from API to download can cause headaches (and we like to minimize those)

2. It stresses our servers. Both bandwidth and CPU-wise. Of the four options I suggested, it causes the most headaches of them all.

Anyway - keep it coming. We want to see jazzy stuff going on in the local space :)

8 | Andre Marcelo-Tanner

November 8th, 2007 at 8:38 pm

There’s no way to stop people from scraping your data. Either you offer it for free or you don’t; people are probably scraping it already, except for the business data exclusives. If you don’t offer it totally free, someone else will. In the end it’s about who gets known for doing it and becomes the #1 source, the go-to guy, because Google and Yahoo can do the same thing but only one will be more popular. Honestly, you will make change if you can make enough noise, or else you get drowned out in all the traffic.

9 | Krispy

November 8th, 2007 at 8:45 pm

You could always just throttle the update time on the free data, while the commercial data gets more “servicing”, as it were. Like what the Linux companies do: give the OS away for free but charge for support.

10 | Frank Michlick

November 8th, 2007 at 8:57 pm

I doubt that anyone else will really offer this type of valuable data for free within a short period of time. Also I agree with the concerns over offering an API based service, since that would require a lot of work and maintenance from your end for a free service. Don’t like the idea of mailing DVDs much either.

So with that being said, I’d probably go with a small application fee, i.e. the $25 you suggested and use that money to pay for someone to review the applications.

And if the idea is really great you take an equity in the business if it takes off ;-) Half-kidding here.

In any case, sticking with a data download vs. API gives you a problem with tracking how and where the data is used, so that’s the main point FOR an API.

/Frank

11 | Marc Miles

November 8th, 2007 at 9:22 pm

Offer a subset of full data. Say the full dataset for Seattle Wa. This gives users tens of thousands of rows of data which is plenty to play around with in a development environment, however, keeping it limited and still giving them a taste of what they could have.

With this you may lose out on purchases for that small subset, but that loss is much less than the loss you are probably making now. And of course still allow for a full release of non-commercial data (no geocode or phone).

The trick is, you need to allow the developer to go through their entire development life cycle with the development data. The small commercial and free dataset allows for application development, and the full non-commercial dataset allows the developer to integrate processes to manage updates etc.

12 | Shappy

November 8th, 2007 at 9:29 pm

Somewhere in each of us is the desire to help the little guy. At the same time, for those of us who have invested significantly in your commercial license (up to $48,000), it is very concerning that the data could just be freely distributed to anyone who states that they cannot afford it.

Your scholarship, for example, went to 1 student or a small number of the most deserving students, not to every student in every school who had to work a job to pay for school.

I know that my opinion will not be popular, but I just don’t think it is fair to help everyone at the expense of those of us who have invested in the commercial license. While I may have more money than the guy without any, the amount I invested in iBegin represented a HUGE percentage of our total operating expenses.

Having the data all over the place significantly cheapens its value to those of us who did pay, both in terms of Google unique content and end-user value.

I like all the ideas raised so far for controlling illicit uses of the data but more than anything I sincerely hope that even approved uses are limited to just the most deserving candidates.

Even in socialized countries, there are limits to wealth redistribution. Help the little guy - yes - but don’t forget who pays all the taxes that keep the iBegin economy rolling.

13 | Matt C

November 8th, 2007 at 9:41 pm

Is there a good way to let the community vet out the intentions of a non-commercial user? Let’s say you do validate the end user with a small payment and have their billing address, what about having a section of iBegin linking to all the non-commercial projects? Having a link to these projects will be a great way for them to draw attention, and the community can provide feedback on how the data is being used.

I think in the end you’ll have to strike a balance between policing the free users and making it easy for them to build apps on the data.

14 | Ahmed Farooq

November 8th, 2007 at 10:08 pm

I still want to digest everything, but I wanted to quickly respond to Shappy - the non-commercial is still explicit that it CANNOT be used for commercial purposes.

And yes - there is a very careful balance needed. Which is why this is about connecting to the right kind of people. 1000 developers or 10 - this isn’t about quantity. It is about interesting new apps being built.

Thanks for your thoughts guys, and I look forward to hearing more.

15 | Dave

November 8th, 2007 at 10:23 pm

Personally, I think the way you currently have things is great. What other company offers so much data for free for testing??

I agree with the limited subset idea- a small amount of data should be sufficient for testing the project before investing in the full commercial license. For our project, it would be nice to have the phone numbers for some testing, but we can wait until we decide to get the commercial license.

To get the full commercial license for free (or nominal fee), I would say the user should jump through a bigger hoop than just a few hundred word essay. If it’s a school or non-profit, I can see that - just let them provide proof of their status. For others with a good idea but “just can’t afford it,” then have them write a full-blown proposal. Other researchers with limited budgets have to apply for grants, so why can’t they?

As someone who intends to be purchasing the commercial version in a few months, I would be very disappointed to know that people could easily scam a commercial license just by jotting down a few pity-inspiring sentences.

Another thought: what about dropping the requirement for a link back to ibegin.com for commercial licensees, but keeping the requirement for non-commercial users?

16 | Shappy

November 8th, 2007 at 10:24 pm

Not immediately anyway. But someone could use the non-commercial license to establish a large user base and then monetize it down the road, switching to a commercial license at that time. That is one of the most popular Web 2.0 business models around and everyone does it.

So, in effect, those of us who pay would be subsidizing new competitors who got the benefit of being able to start out free and then come eat our lunch.

And even if someone doesn’t go commercial, that doesn’t mean that a non-commercial use couldn’t hurt our commercial use. Since they don’t have to pay for the data, they don’t necessarily need to make money off it, so they can give away competing services. There is nothing to stop someone from creating a free service that competes with my commercial service.

Now, you may say that that is the free market and you would be right - except - if they had to pay for the data in the first place (like we did) then the free market probably wouldn’t allow them to survive.

I think it is about 10 or 1000 just as much as it is about quality. If quality and the best apps is what it is really about, then that means only the best of the best of the best get the scholarship/government subsidy - no different than an SBA loan.

I admire what you are trying to do and hate to be the republican ass that people will say doesn’t care about starving children. I just want to share a contrarian opinion from someone who paid up BIG to get the license and would hate to be competing against someone (commercial or not) who got for free what I paid up to $48,000 for.

I would like to see a cap of say x% so that if you have 1000 paying customers, you would only award 10 fully-featured non-commercial grants to the most deserving candidates.

17 | Ahmed Farooq

November 8th, 2007 at 10:49 pm

Okay - got some time to read and think a bit :)

The non-commercial license is a twofer for us - helping new ideas develop quickly, and letting people prototype a commercial product before they go ahead and purchase (part of our open-mantra). Consider it a soft upsell - if you want to make money, you have to support iBegin Source. So it’s a win-win for everyone involved.

I quickly posted about why I don’t like APIs. But it may turn out to be the best way to control the flow of data. Then again, we don’t have the infrastructure Google or Yahoo has :)

Frank’s idea was intriguing - an application (as others have liked) and an application fee to cover our looking over it. I’ve been a big fan of small transactions (~$25) to weed out non-serious from serious people. The application ensures we have a clue what is going on, and the fee implies a bit of seriousness and also gives us a billing/contact address to follow up with.

Marc’s idea is also intriguing - we’ve considered releasing the first 10-25k records for a few major cities (eg San Francisco, New York, etc - where there are roughly 50k+ businesses).

An extension of that idea would be to *completely open up* one city. Say - San Francisco (as so much goes on here).

Again - these are various ideas that can be mashed together. Eg application fee + idea gets you full access to SF and 25k records from NY. A mix and match of various ideas.

Matt - while intriguing, we also are very careful of privacy here. I don’t think developers want to announce what they are up to and then have someone steal the idea. Community vetting works (imo) when the end goal is the same (good business data). When it comes to competing websites, I get a bit more cautious :)

Dave - commercial licenses have never had to link back to us. Part of our ethos - we have data providers we don’t disclose, why should we force you to disclose your connection to us :)

To clarify the [existing] differences:
Commercial: All data
Non-commercial: All data minus phone # + geocoding

Commercial: No link back
Non-commercial: link back

Commercial: Direct downloads, daily/weekly/monthly updates [direct are important for automation]
Non-commercial: generated link, monthly updates

Commercial: can profit
Non-commercial: cannot generate revenue.

Shappy - your comments hold a lot of water because you are an existing customer. One paying a significant amount. And you are right - their dev costs go down while they establish themselves. Or they build a free alternative to you. The *bottom line* is without our commercial users we won’t continue, so we definitely want to hear from you (and this is where future purchasers are also important).

Right now I am leaning towards:
1. An ‘application’ with a (possibly refundable) $15 fee. The application would include personal information (nothing invasive such as SSN/SIN). The information would then be used to gauge if the person has the proper skills to develop something, and if they can ill-afford a commercial license.
2. Limit # of approvals per month. They will also be publicly posted (for our commercial users to see what is going on)
3. Releasing a city fully (eg San Francisco), and LA/New York/Houston with 20k entries each (each has roughly 100k+ businesses). This would allow complete testing (as you have both East/West coast and Houston down in the south) without giving up the entire farm.

How does that sound?

18 | Shappy

November 9th, 2007 at 2:19 am

I applaud your willingness to have open dialogs with your customers, which can often be a scary and time-consuming endeavor. It is no surprise that you have been as successful as you have, given that commitment to your company and customers.

I think the options you identified represent a fair compromise and appreciate your looking at the issue from all perspectives.

One thing that I don’t understand from some of the other posts is why a developer would need real phone numbers and geo-coding for development and testing. For most development and testing purposes, dummy numbers and zip-centroid lat/long should be more than enough to test field display and mapping.
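To make the dummy-data suggestion concrete, here is a minimal sketch of how a test listing could be derived from a real one. The field names (`phone`, `zip`, `lat`, `lon`), the function `make_test_listing`, and the centroid table are all assumptions for illustration, and the coordinates are rough placeholder values rather than authoritative centroids. The 555-01xx range is used because it is reserved for fictional numbers in North America.

```python
# Hypothetical ZIP-to-centroid lookup; values are rough placeholders only.
ZIP_CENTROIDS = {
    "94103": (37.77, -122.41),
    "77002": (29.76, -95.36),
}


def make_test_listing(listing: dict, index: int, centroids: dict = ZIP_CENTROIDS) -> dict:
    """Return a copy of `listing` with a dummy phone and ZIP-centroid coordinates."""
    scrubbed = dict(listing)  # shallow copy so the original record is untouched
    # Cycle through the reserved fictional block 555-0100 .. 555-0199.
    scrubbed["phone"] = f"555-01{index % 100:02d}"
    # Replace precise geocoding with the centroid of the listing's ZIP code,
    # or None when the ZIP is not in the lookup table.
    scrubbed["lat"], scrubbed["lon"] = centroids.get(listing["zip"], (None, None))
    return scrubbed


sample = {"name": "Example Cafe", "zip": "94103",
          "phone": "415-555-0000", "lat": 37.7712, "lon": -122.4123}
print(make_test_listing(sample, 7))
```

Records scrubbed this way would still exercise field display and map rendering, which is the use case the comment describes, without exposing real contact or location detail.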

More than the actual numbers, what would be most important to me for development and testing is knowing how many listings are in each location/category and how many have phone, fax, web url, hours, etc.

The sad reality is that we live in a society where people have little respect for intellectual property, as evidenced by the gazillions of illegal music downloads that happen every day by otherwise law-abiding good people who wouldn’t walk into a record store and put a CD in their pants. But once the data is put out there, the line blurs and even good people can cross it. The less it’s out there, the less temptation [and ability] to do wrong.

Again, I think the options you identified are a pretty good compromise and I really do appreciate the open discussion.

We’ll look forward to seeing you put some of these plans in action and how things proceed from there.

In the meantime, we’ll be busy working on our major national iSource implementation, for which we are beta launching 1/1/08. We’ll keep you up to date of our progress as we get closer.

Carpe Diem!

19 | TerryT

November 9th, 2007 at 2:29 am

As a likely future customer (who will struggle to find the money for the full US dataset so may have to start with fewer States) I have mixed feelings about “Data Scholarships” that give access to the commercial dataset. On the positive side I think there is merit in granting a small number (as some projects need the phone numbers and geo-data to see if they will fly) but I can see the downside being too many me-too projects that undermine the interests of commercial licence holders if the standards are set too low.

The difficulty I see is judging where to set the bar for eligible projects; set them too low and commercial interests are at risk, set them too high and innovation may be stifled.

I believe true innovation in the local field takes a certain level of knowledge and a lot of dedication. This means that, as harsh as it may sound, there are many capable people out there that will never be able to innovate in this field. Those unable to obtain one of the few “Data Scholarships” should still have the ability to demonstrate their innovations by taking advantage of ‘evaluation data’ or any alternative method decided upon.

I also believe that some financial commitment is desirable as part of any development, the resources required to truly innovate mount up to way more than the $1000 fee for one State. Having said that I know that $1000 in cash is a lot for many people and I would support the idea of a discounted first State in suitable circumstances. Something that I would like to add to this though is that I believe anyone capable of local search innovation should also be able to find the time and have the skills to earn $1000 to buy their first licence at the full rate (a route that is open to anyone right now), however a discounted rate of say $250 may allow a cash-strapped developer to invest in better hosting for their test environment.
Given what has been said above I would suggest that:

1. “Data Scholarships” should be awarded for the first year only, if it doesn’t fly after 12 months it probably never will.

2. Those unable to obtain “Data Scholarships” should be able to purchase their first-year licence for their home State (the State in which they reside - as shown by their credit card billing address) at a discounted rate for commercial use. This approach means they will be working with data that should be more suitable for them and it will reduce the impact of everyone using the same ‘evaluation’ dataset.

3. Non-Commercial Licenses:

I would like to see Non-Commercial licences remain in some form (see below) with the addition of a paid registration process (Non-Commercial does not have to mean free). The fee for registration should be small enough to not act as a barrier but large enough to pay for the cost of screening applications.

A major difficulty I see with Non-Commercial licenses is the real chance that unrestricted use of the data will inadvertently clog the search engines with pages that are too similar. This is a clear threat to the continuing existence of the service if such fears come true.

I suggest the radical step of inserting a clause that makes it clear that all web pages containing Non-Commercial data have to be excluded from being indexed by search engines. This still allows functionality to remain but gives some protection for commercial licence holders (I understand that there are other threats but at least this is one that can be contained).

20 | Colin

November 10th, 2007 at 4:12 pm

Having access to free local data is invaluable for a startup and will directly lead to the purchase of this data if the project takes off.

Given the choice, I’d rather have access to the complete data for a given region with removed info such as phone & geocoding - when demoing you’d rather have explicit constraints such as ‘search on phone num not available in prototype/demo/beta’ instead of just not finding merchants and not knowing if it’s because your search criteria are not right or if the data is just missing.

My suggestion is to hand out small free data samples for developers to be able to play with in the early stages. Still offer the complete region’s data for free (with removed fields) but have the requester go through a manual contractual agreement for the non-commercial use of the data. I wouldn’t have any problem with that.

Colin.

21 | Andre Marcelo-Tanner

November 11th, 2007 at 1:49 pm

How about just creating a portal or another site to promote and facilitate the development of iBegin Source applications? Support that development with more features, but keep the commercial features commercial-only, because they aren’t needed for non-commercial apps. So basically keep the commercial features as they are, but step up support for non-commercial users with things like APIs, direct downloads, and better ways to use the data.

22 | Matt C

November 12th, 2007 at 2:42 pm

Ahmed- your comment about everyone’s interests prompted a thought: a programmer who can build a good app may lack money, but can also prove commitment to the project with an investment of time. If you decide on an application process, part of the criteria can be number or quality of updates that user has already made to the dataset, either through the source listings or iBegin cities. This will help identify people who are willing to provide a service to the community as “payment” when they can’t afford $1000 or $48,000, and provides some value to the paying clients in the form of cleaner data.

23 | AhmedF

November 12th, 2007 at 3:27 pm

Excellent idea Matt - contribute to get some data back :)

24 | Matt M

November 16th, 2007 at 3:55 am

I do like this idea:
Someone applies, and the data gets shipped out on a DVD, because if it’s FedEx, you have to have someone sign for it and you will know the address. If anyone applies for an API online, it can be faked; you may even be able to download the data without agreeing to the form (I tried it on other sites and it can be done). I think an application along with a DVD would be the only true way you would know who is getting it. Still, I think there should be a cost associated with it. $1,000 a state is too much for me, let alone $40k for everything. As much as I would love it, it’s too much, but I could pay $1000 for more than one state.

Here’s the problem with offering data for a large city. Say I want the Bed and Breakfasts for say, Houston, I don’t want to pay for restaurants, book stores, schools and other data that doesn’t pertain to what I need. Perhaps you could go to a categorical sale offering, that would be much better (think of the Android/Google Maps applications that would be powered with your data).

Also, does it matter if someone wants information for one area or for the whole country? Some of us have grand ambitions…lol


About

iBegin provides services to companies operating in the online local marketplace.