About this site
Companyfacts.io is brought to you by Datatriever Oy (Oy is the Finnish equivalent of Ltd/Inc/GmbH).
More about Datatriever Oy and Companyfacts.io below.
What is this
Companyfacts.io is a showcase portal for the data that Datatriever Oy produces. It's not all of it, far from it. Please do get in touch if you have an appetite for more.
Datatriever's mission
Datatriever's mission is to become one of the world's largest independent company data providers - not by country count, but by the amount of data per selected country. The weight should be on the quality and coverage of the data in the selected countries.
Datatriever gathers the largest possible dataset about companies for each country it adds to its portfolio.
The main focus area is the European Union, but due to growing customer demand, non-EU countries such as Australia, Canada, New Zealand, the US and the UK have already been added.
The companyfacts.io portal is a partial showcase of the data that is collected - not all data is shown, as some of it is reserved for paying customers.
The main data distribution method is the REST API, but other methods include large datasets and incremental lists of changes.
Future plans include freemium and paid monitoring options for selected companies.
Data coverage and sources
Most of the data is open data, but several proprietary (paid) sources are also in use. This data is mainly reserved for paying customers.
Many of the open data sources require at least a signed contract with the data provider - EUIPO, WIPO, Estonia, Ireland, Finland, New Zealand, Netherlands and Denmark, to name a few.
It is also worth noting that many sources offer no history, and this is where Datatriever excels - we store and remember everything
as a slowly changing dimension (SCD) type 6, which is why the current data lake size is more than 100 TB (raw storage) and growing rapidly.
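To illustrate what SCD type 6 means in practice, here is a minimal sketch in Python. It is not Datatriever's actual implementation; the class, the field names, the company ID and the company names are all made up for the example. Type 6 combines type 2 (one row per validity period), type 1 (a "current value" column overwritten on every row) and type 3 (the historical value kept next to the current one).

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class CompanyNameRow:
    """One validity period of a company name (hypothetical schema)."""
    company_id: str
    historical_name: str       # value valid during this row's period (type 2)
    current_name: str          # latest known value, overwritten on all rows (type 1)
    valid_from: date
    valid_to: Optional[date]   # None means the period is still open
    is_current: bool


def apply_change(rows: List[CompanyNameRow], company_id: str,
                 new_name: str, changed_on: date) -> None:
    """Record a name observed on `changed_on`, keeping the full history."""
    history = [r for r in rows if r.company_id == company_id]
    current = next((r for r in history if r.is_current), None)

    # Nothing to do if the source still reports the same value.
    if current is not None and current.historical_name == new_name:
        return

    # Close the open period instead of overwriting it (type 2).
    if current is not None:
        current.valid_to = changed_on
        current.is_current = False

    # Refresh the "current" column on every older row (type 1).
    for r in history:
        r.current_name = new_name

    # Open a new period for the new value.
    rows.append(CompanyNameRow(company_id, new_name, new_name,
                               changed_on, None, True))


# Made-up example data.
rows: List[CompanyNameRow] = []
apply_change(rows, "EX-001", "Example Oy", date(2020, 6, 1))
apply_change(rows, "EX-001", "Example Group Oy", date(2022, 3, 1))
for row in rows:
    print(row)
```

After the two calls, the first row still carries the old name and its validity period, while its current_name column already shows the new name. That is how history survives even when the source itself only publishes the latest state.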
About Datatriever
Datatriever Oy, a limited company, was founded in March 2022 in Finland. It was preceded by Otacode Tmi (a private person carrying on trade) from June 2020.
The original idea of collecting massive amounts of company-related data was born in 2016 and developed for six months, but then went on hiatus until 2020.
The first site, Yritys.io, went online in 2020 and was finally moved to companyfacts.io in 2024, when additional countries required a more 'international' approach.
Datatriever Oy, and before that Otacode, have been selling datasets and REST API data since 2020.
Investors, partners, business mentors & angels
This is a project of a single person, Teemu Otala, who also has another day job. It is built, developed and operated by a single person (LinkedIn link here).
Datatriever Oy is owned 100% by a single person.
The Founder & CEO has 25+ years of extensive experience in software architecture, hardware, networks, databases and high-availability solutions, and everything here, frontend and backend, has been developed, maintained and operated by a single person.
Datatriever Oy was started without any financial capital and it has been profitable since day one.
Datatriever Oy grows at a double-digit rate without any advertising or new customer acquisition - customers find Datatriever, and existing customers purchase new offerings as they become available. Currently all profits are reinvested in growing the current offering, data lake and warehouse.
Combined turnover of the customer base is well above 100 million euros.
There has been a plethora of interested venture capitalists, business angels and mentors, several job offers, and some even wanting to buy the company... But it still operates as a single-person project.
That may change in the near future if the right offer (investment, buyout or merger) comes along - feel free to contact/connect.
But please don't waste my time, or yours for that matter, trying to sell your product - time is the most precious thing we have, so let's not waste it.
Some usage statistics
- 5-7 million REST API requests per month,
- Tens of datasets containing more than 15 million rows delivered monthly, and
- Uptime 99.9% for the third year in a row
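To put these figures in perspective, the small Python sketch below converts an uptime percentage into a yearly downtime budget and a monthly request count into an average request rate. The function names are made up for the example, and the 30-day month and 365-day year are simplifying assumptions, so treat the results as rough orders of magnitude rather than measured values.

```python
HOURS_PER_YEAR = 365 * 24  # assumption: non-leap year


def downtime_budget_hours(uptime_percent: float,
                          period_hours: float = HOURS_PER_YEAR) -> float:
    """Maximum downtime, in hours, that still meets the uptime percentage."""
    return period_hours * (1 - uptime_percent / 100)


def average_requests_per_second(requests_per_month: float,
                                days_in_month: int = 30) -> float:
    """Average request rate, assuming traffic is spread evenly over the month."""
    return requests_per_month / (days_in_month * 24 * 3600)


print(f"{downtime_budget_hours(99.9):.2f} hours of downtime per year")  # 8.76
print(f"{average_requests_per_second(5_000_000):.2f} requests/s")       # 1.93
print(f"{average_requests_per_second(7_000_000):.2f} requests/s")       # 2.70
```

Real traffic is rarely even, so peak rates are likely to be well above these averages.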
More about Datatriever offering
- Datatriever Oy also provides all the data you see here (and more) via web services (REST API) and datasets. Reasonable pricing guaranteed.
- Feel free to ask for additional data (countries and specific sources). The latest addition, Danish financial statements (2M+ documents), was made at a customer's request.
- Datatriever Oy also has a lot of unlisted proprietary data that is sold separately due to licensing issues.
Technology
This and all other services provided by Datatriever Oy are delivered mainly from three physical locations (two main ones) about 25 km apart. This is to guarantee that no data ever gets lost.
All production services are clustered at the primary location for high availability and right-sized to withstand a 50% capacity loss. All services are built with robust open source software from mature projects.
There are about 50 servers, 15 physical and 35 virtual, behind five load balancers and everything is monitored and automated as far as sensible. Separate production, testing and quality assurance environments exist.
The technology stack is as flat and simple as it can be - the fewer moving parts, the more robust the end product.
Technology stack includes but is not limited to:
- OS: Linux (Ubuntu 20.04/22.04/24.04)
- Coding: Python, JavaScript, PHP, Go
- Database: PostgreSQL, MongoDB, MariaDB, Manticore, Elasticsearch
- Web server: Nginx, (Node.js)
- Load balancer: HAProxy, Cloudflare, (datacenter service)
- Monitoring & Reporting and orchestration: Zabbix, Ansible, Prometheus, Grafana, ELK stack, Better Uptime, Pushover, Matomo
- Broker/Task queue: RabbitMQ, Redis, Celery
- REST API: FastAPI, KongHQ
- Storage: Replicated MinIO S3 instances, Nextcloud
- Workflows: Prefect.io, Airflow, cron
- Testing: Hoppscotch
- Version control: GitHub
- Change detection: Wachete, Change Tower
- Documentation: Notion.so, Confluence, Slate (for Web Services), Mintlify
- Vault: Infisical, Prefect.io
Some technologies run in legacy mode and are awaiting infrastructure renewal.
Rule of thumb is:
- Cloud first: when building things, always go SaaS, IaaS or PaaS first, but for massive databases and storage requirements, selected on-premise hardware is far more cost-efficient
- Open Source is the way
Built with love (heart) in Finland (flag)