As a Rust Software Engineer at Qwant, one of Europe’s leading search engines, I focus on developing and optimizing web-scale crawling systems and content processing pipelines.
My key contributions include a high-performance HTML content extraction service that matches the quality of industry-standard tools while running substantially faster. I also architected a sharded, lock-free crawling system that dramatically improved performance while enforcing strict website politeness policies. The system parses and caches robots.txt files and uses them to filter URLs.
My work spans the crawler’s core infrastructure, including URL redirection management, filtering systems, and comprehensive end-to-end testing, all of which support the robustness of Qwant’s search engine.