What happens when you grow tired of all the ‘ghost job’ postings on popular platforms like LinkedIn and Indeed? You either live with the problem and try your luck by applying to every listing, or you turn to AI to make life easier. One man, who shared his journey on Reddit, revealed how he used ChatGPT to scrape more than 4 million jobs directly from company websites.
The developer, who goes by hamed_n on Reddit, built a new job search tool using the ChatGPT API. He announced on the r/ChatGPT subreddit that he had scraped over 4.1 million real-time job listings directly from company websites, which led to the creation of his new platform, Hiring.Cafe.
“I’ve now used this technique to scrape 4.1 million jobs (with over 220k remote jobs) and built powerful filters,” said the developer.
ChatGPT API makes job hunting easy
The project was created to address a common frustration among job seekers: the difficulty of finding legitimate, open positions that aren’t old, filled, or simply being used for data collection by recruiting agencies. The idea behind the project was to use the ChatGPT API to overcome a huge technical hurdle.
While large companies post jobs on their own sites, the formats of these listings are highly inconsistent, which makes traditional scraping difficult. The creator used ChatGPT to parse the unstructured job descriptions pulled from company websites and convert them into a clean, searchable JSON format, a process that proved highly effective.
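The parsing step described above can be sketched roughly as follows. This is a minimal illustration, not the developer's actual code: the field names in `FIELDS`, the prompt wording, and the `ask_model` callable (a placeholder for whatever function sends a prompt to the ChatGPT API and returns its text reply) are all assumptions. A stand-in `fake_model` is used so the sketch runs without network access.

```python
import json
from typing import Callable

# Hypothetical schema: the article mentions title, industry, years of
# experience, and remote status, but the real field names are not given.
FIELDS = ("title", "industry", "min_years_experience", "remote")


def build_prompt(raw_posting: str) -> str:
    """Ask the model for a single JSON object with a fixed set of keys."""
    return (
        "Extract the following fields from the job posting and reply with "
        "a single JSON object and nothing else. Use null for anything the "
        f"posting does not state. Fields: {', '.join(FIELDS)}.\n\n"
        f"Posting:\n{raw_posting}"
    )


def parse_reply(reply: str) -> dict:
    """Validate the model's reply and keep only the expected keys."""
    data = json.loads(reply)  # raises ValueError on malformed JSON
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    return {field: data.get(field) for field in FIELDS}


def extract_job(raw_posting: str, ask_model: Callable[[str], str]) -> dict:
    """Turn one unstructured posting into a structured record."""
    return parse_reply(ask_model(build_prompt(raw_posting)))


def fake_model(prompt: str) -> str:
    """Stand-in for a real API call (assumption, for offline use only)."""
    return (
        '{"title": "Data Engineer", "industry": "Fintech", '
        '"min_years_experience": 3, "remote": true, "extra": "ignored"}'
    )


job = extract_job("We are hiring a Data Engineer ...", fake_model)
print(job)
```

Restricting the result to a known set of keys and rejecting malformed JSON is one simple way to keep unexpected model output out of the searchable data, although it does not catch a wrong value in a valid field.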
The result is a custom job board that provides advanced filtering options for job seekers, including search by title, industry, and years of experience. The Redditor highlighted that because the listings are pulled directly from the source, they are far more likely to be genuine and actively hiring positions.
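Once listings are in a structured form, filters like the ones mentioned above become straightforward. The following is a small sketch under assumed field names (`title`, `industry`, `min_years_experience`) and with made-up sample records. It does not reflect how Hiring.Cafe implements its search.

```python
# Made-up sample records, for illustration only
jobs = [
    {"title": "Data Engineer", "industry": "Fintech", "min_years_experience": 3},
    {"title": "Senior Data Engineer", "industry": "Health", "min_years_experience": 7},
    {"title": "Product Designer", "industry": "Fintech", "min_years_experience": 2},
]


def filter_jobs(jobs, title=None, industry=None, max_years=None):
    """Return jobs matching every filter that was supplied."""
    results = []
    for job in jobs:
        # Case-insensitive substring match on the title
        if title and title.lower() not in job["title"].lower():
            continue
        # Case-insensitive exact match on the industry
        if industry and job["industry"].lower() != industry.lower():
            continue
        # Keep only jobs that require no more than max_years of experience
        if max_years is not None and job["min_years_experience"] > max_years:
            continue
        results.append(job)
    return results


matches = filter_jobs(jobs, title="data engineer", max_years=5)
print([job["title"] for job in matches])  # prints ['Data Engineer']
```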
Users had concerns about AI
In a follow-up discussion in the comments section, the developer explained how he verified the legitimacy of the companies, saying he used third-party databases such as Apollo.io and Dun & Bradstreet to find and authenticate the businesses.
A few users raised concerns about the potential for hallucinations or errors in the AI-generated data. The developer and others responded that minor inaccuracies may occur, such as a job’s remote status being wrong. They argued, however, that the fundamental problem of “ghost jobs” has been significantly reduced, because the jobs themselves are real and posted directly by the companies.