Paste a magnet link. Get a direct HTTPS download.
No P2P clients. No firewall issues.
DataCrate AI is an open-source dataset provisioning platform built for the academic community. It bridges the gap between publicly available research datasets distributed via BitTorrent and the restrictive network environments found in most universities and research institutions.
Researchers and students frequently need large-scale datasets, such as ImageNet, Common Crawl, LAION, and The Pile, that are distributed as torrents. Institutional firewalls and network policies commonly block peer-to-peer traffic, forcing researchers to find workarounds, use personal connections, or simply give up.
DataCrate accepts a magnet link, fetches the dataset through a cloud-based caching layer, and provides direct HTTPS download links that work on any network — no P2P clients, no VPNs, no firewall exceptions required.
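As an illustration of the first step in that flow, the sketch below shows one way a submitted magnet link could be checked before it is passed on for caching. It is not taken from the DataCrate source: the helper name `parse_magnet` and the choice to accept only BitTorrent v1 (`urn:btih`) links are assumptions made for the example.

```python
import re
from urllib.parse import urlparse, parse_qs

# BitTorrent v1 info hashes appear as 40 hex characters or 32 base32 characters.
_BTIH = re.compile(r"^(?:[0-9a-fA-F]{40}|[A-Za-z2-7]{32})$")


def parse_magnet(link: str) -> dict:
    """Return the info hash and optional display name from a magnet link.

    Hypothetical helper (not DataCrate's actual code). Raises ValueError
    when the link is not a magnet URI carrying a valid urn:btih hash.
    """
    parts = urlparse(link.strip())
    if parts.scheme != "magnet":
        raise ValueError("not a magnet link")

    params = parse_qs(parts.query)
    for xt in params.get("xt", []):
        if xt.lower().startswith("urn:btih:"):
            info_hash = xt[len("urn:btih:"):]
            if _BTIH.match(info_hash):
                # Normalize hex hashes to lowercase; leave base32 untouched.
                if len(info_hash) == 40:
                    info_hash = info_hash.lower()
                name = params.get("dn", [None])[0]
                return {"info_hash": info_hash, "name": name}

    raise ValueError("magnet link has no valid urn:btih info hash")
```

Rejecting malformed links up front means only well-formed requests would reach the caching layer.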
DataCrate is designed for university researchers, graduate students, and academic staff who need fast, reliable access to open research datasets. While currently serving the University of Calgary community, the platform is fully open source and can be deployed by any institution.
DataCrate AI is released under an open-source license. Universities, labs, and research groups are welcome to fork, deploy, and adapt the platform for their own communities. Contributions, bug reports, and feature requests are encouraged.
For questions, issues, or deployment support, please open an issue on the project's GitHub repository or reach out to the maintainers through your institution.
Last updated: July 2026
DataCrate AI is committed to protecting the privacy of its users. This platform is operated as an open-source academic tool and collects only the minimum data necessary to provide dataset download services to authorized university members.
All data is stored in a PostgreSQL database on infrastructure controlled by the platform administrators. Passwords are hashed using bcrypt with a cost factor of 12. API traffic is encrypted in transit with HTTPS. JSON Web Tokens (JWTs) are used for session management and expire after 7 days.
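For readers who want to see what a 7-day session token looks like in practice, here is a minimal sketch built only on the Python standard library. It is an illustration, not DataCrate's implementation: the HS256 signing algorithm, the claim names chosen, and the function names `issue_token` and `verify_token` are all assumptions. Password hashing with bcrypt is not shown because it requires a third-party package.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional

SESSION_TTL = 7 * 24 * 60 * 60  # 7 days in seconds, as stated in the policy


def _b64url(data: bytes) -> str:
    # JWTs use URL-safe base64 without padding.
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def issue_token(user_id: str, secret: bytes, now: Optional[int] = None) -> str:
    """Create an HS256-signed token that expires SESSION_TTL seconds after `now`."""
    now = int(time.time()) if now is None else now
    header = {"alg": "HS256", "typ": "JWT"}
    payload = {"sub": user_id, "iat": now, "exp": now + SESSION_TTL}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    sig = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return f"{signing_input}.{_b64url(sig)}"


def verify_token(token: str, secret: bytes, now: Optional[int] = None) -> dict:
    """Check signature and expiry; return the payload or raise ValueError.

    Simplification: the header's "alg" field is not inspected because this
    sketch only ever signs and verifies with HS256.
    """
    now = int(time.time()) if now is None else now
    try:
        head, body, sig = token.split(".")
    except ValueError:
        raise ValueError("malformed token")
    expected = hmac.new(secret, f"{head}.{body}".encode("ascii"), hashlib.sha256).digest()
    if not hmac.compare_digest(expected, _b64url_decode(sig)):
        raise ValueError("bad signature")
    payload = json.loads(_b64url_decode(body))
    if now >= payload["exp"]:
        raise ValueError("token expired")
    return payload
```

A production deployment would normally rely on a maintained JWT library rather than hand-rolled signing; the point here is only to show how the `exp` claim enforces the 7-day limit.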
Account data is retained as long as your account is active. Dataset request records are kept for operational purposes. You may request account deletion by contacting an administrator; deletion removes your account and all associated dataset records.
DataCrate uses TorBox as a cloud-based torrent caching service to convert magnet links into HTTPS downloads. When you submit a magnet link, it is forwarded to the TorBox API. Please refer to TorBox's own privacy policy for its data handling practices.
Because DataCrate is open source, you can inspect exactly what data the platform collects and how it is processed by reviewing the source code. Self-hosting institutions are responsible for their own privacy practices and should adapt this policy accordingly.
This privacy policy may be updated as the platform evolves. Material changes will be communicated through the platform interface.
Last updated: July 2026
By creating an account on DataCrate AI, you agree to these Terms & Conditions. If you do not agree, please do not register or use the platform.
DataCrate is available exclusively to current students, faculty, staff, and alumni of participating universities. You must register with a valid institutional email address (e.g., @ucalgary.ca or @alumni.ucalgary.ca). You are responsible for maintaining the confidentiality of your account credentials.
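The institutional email requirement can be pictured as a simple allow-list check at registration. The sketch below is illustrative only: the function name `is_institutional_email` is hypothetical, and the allow-list contains just the two example domains named above, whereas a real deployment would configure its own list.

```python
# Example domains from the terms above; a deployment would supply its own list.
ALLOWED_DOMAINS = frozenset({"ucalgary.ca", "alumni.ucalgary.ca"})


def is_institutional_email(address: str, allowed=ALLOWED_DOMAINS) -> bool:
    """Return True when the address's domain exactly matches an allowed domain.

    Hypothetical helper. The match is exact (not a suffix match), so a
    look-alike such as "ucalgary.ca.example.com" is rejected.
    """
    local, sep, domain = address.strip().rpartition("@")
    if not sep or not local or "@" in local:
        return False
    return domain.lower() in allowed
```

A domain check like this only confirms the address format; confirming that the person controls the mailbox would still require a verification email or similar step.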
You agree to use DataCrate only for lawful academic and research purposes. You may not:
DataCrate is intended for publicly available research datasets. This includes datasets published by research institutions, academic organizations, and open-data initiatives that are distributed via BitTorrent for bandwidth efficiency. Users are responsible for ensuring they have the right to access any dataset they request.
DataCrate is provided "as is" without warranties of any kind. We do not guarantee:
The platform operates on shared infrastructure with pooled API keys. To ensure fair access for all users, administrators may impose limits on concurrent downloads, total storage, or request frequency. Abuse of shared resources may result in account suspension.
Administrators reserve the right to suspend or delete accounts that violate these terms, abuse shared resources, or are no longer affiliated with a participating institution. Users may also request voluntary account deletion at any time.
DataCrate, its maintainers, and hosting institutions are not liable for any damages arising from the use or inability to use the platform, including but not limited to data loss, download failures, or service interruptions. The platform is a community tool provided at no cost.
The DataCrate platform software is open source. Institutions deploying their own instances are responsible for establishing their own terms of service. These terms apply specifically to this hosted instance.
These terms may be updated at any time. Continued use of the platform after changes constitutes acceptance of the revised terms. Material changes will be communicated through the platform interface.
Last updated: July 2026