I’ve been working on this thing for the past few months, and I’d like some people to try it out and give me feedback! I’m hoping we can build a useful dataset of translations, and then start making new DataSources to power new datasets (labeled images, named-entity recognition, etc.).


Metro allows science projects to be powered by a crowd of people who self-generate the data for it. One of its uses is to create open datasets collaboratively, where every contributor is able to access all of the data.

Data generation happens on your computer, using “DataSources”. A DataSource is a community-made, open-source plugin for Metro, which generates data for you.

You simply install the Metro browser extension and activate the DataSources which power the project. You’ll also need to sign up, which doesn’t require email verification right now, so it takes just seconds.

As a test-run, I made an Open Data project for gathering sentence-level translations in 7 languages, and I would like everyone to try it out!

[Open Data] Sentence Translations

It’s powered by a DataSource which allows you to highlight any text on the internet, right-click, press a “translate” button, and enter your translation.
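
To make that concrete, here is a minimal sketch of what a single datapoint from that highlight-and-translate flow might look like. It is only an illustration: the `TranslationRecord` fields and the `make_record` helper are hypothetical names chosen for this example, not Metro's actual schema or API.

```python
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class TranslationRecord:
    """Hypothetical shape of one datapoint (an assumption, not Metro's schema)."""
    source_text: str      # the text the user highlighted
    source_url: str       # the page the text was highlighted on
    target_language: str  # language code of the translation, e.g. "fr"
    translation: str      # the translation the user typed in


def make_record(source_text, source_url, target_language, translation):
    """Trim whitespace and reject empty text before building a record."""
    source_text = source_text.strip()
    translation = translation.strip()
    if not source_text or not translation:
        raise ValueError("source text and translation must be non-empty")
    return TranslationRecord(source_text, source_url, target_language, translation)


# Example: one highlighted sentence and the translation entered for it.
record = make_record("Good morning", "https://example.com", "fr", "Bonjour")
print(json.dumps(asdict(record), ensure_ascii=False))
```

Keeping the source URL next to the text is a design choice for this sketch: it would let a contributor trace where a sentence came from.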

You’ll need to (1) sign up, (2) install the extension, and (3) activate the DataSource on the project page.

I want Metro to be able to support open-data generation at any scale. Eventually, I want it to be the backbone for startups powered by ethical, self-generated data, because it provides access to data from any platform on the internet, including “closed” ones like Facebook, Google, etc., while giving users true autonomy over their data.

Any feedback, help, or just usage of the system is really useful for finding and fixing the problems that I just can’t see yet. I’m just a guy who recently graduated from college, so don’t judge me too harshly for any problems.

Thank you!

