Senior Data Extraction Engineer
Location
Hyderabad, Bengaluru or Gurugram
Job Type
Full-Time
Experience Level
Associate level (1-3 Years)
Salary Range
Not disclosed
Job Description
We are looking for resourceful and exceptional candidates for the Data Engineer role within our product development teams based out of Hyderabad, Bengaluru and Gurugram. At DESIS, the Data Engineers develop Web Robots, or Web Spiders, that crawl through the web and retrieve data in the form of HTML, plain text, PDFs, Excel, and any other format that is either structured or unstructured. The job functions of the engineer also include scraping the website data into a structured format and building automated and custom reports on the downloaded data that are used as knowledge for business purposes. The team also works on automating end-to-end data pipelines.
What you’ll do day-to-day
As a member of the Data Engineering team, you will be responsible for various aspects of data extraction, such as understanding the data requirements of the business group, reverse-engineering the website, its technology, and the data retrieval process, re-engineering by developing web robots to automate the extraction of the data, and building monitoring systems to ensure the integrity and quality of the extracted data. You will also be responsible for managing the changes to the website's dynamics and layout to ensure clean downloads, building scraping and parsing systems to transform raw data into a structured form, and offering operations support to ensure high availability and zero data losses. Additionally, you will be involved in other tasks such as storing the extracted data in the recommended databases, building high-performing, scalable data extraction systems, and automating data pipelines.
The ideal candidate should have the following basic qualifications:
2 to 4 years of experience in website data extraction and scraping
Good knowledge of relational databases, writing complex queries in SQL, and dealing with ETL operations on databases
Proficiency in Python for performing operations on data
Expertise in Python libraries and frameworks such as Requests, urllib2, Selenium, Beautiful Soup, and Scrapy
A good understanding of HTTP requests and responses, HTML, CSS, XML, JSON, and JavaScript
Expertise with debugging tools in Chrome to reverse engineer website dynamics
A good academic background and accomplishments
A BCA/MCA/BS/MS degree with a good foundation and practical application of knowledge in data structures and algorithms
Problem-solving and analytical skills
Good debugging skills
About D. E. Shaw India Private Limited
D. E. Shaw India Private Limited (D. E. Shaw India) is a part of the D. E. Shaw group, a global investment and technology development firm. Established in 1996 as a technological development center, D. E. Shaw India has expanded its business into a broad range of software and financial activities. Its staff collaborates with colleagues around the world not only to build cutting-edge proprietary software systems that form an important part of the D. E. Shaw group’s investment activities but also to provide research and operations support to the firm’s systematic, discretionary, and hybrid strategies.
An entrepreneurial spirit has always been a part of D. E. Shaw India’s culture. When external platforms don’t meet our needs, we build our own. We created one of the first NoSQL databases in our industry, built a powerful data analysis library for Python before pandas existed, and built a modern web engineering framework tailored to the firm’s needs.
This profile and any links posted through this profile (together, the “Content”) are provided for your information only and do not convey investment advice or an offer of any type with respect to any securities or other financial products. The D. E. Shaw group does not endorse any information or beliefs discussed in any links posted through this profile and makes no representation as to the accuracy or adequacy of the Content. The Content has not been updated for any information that may have changed since publication. No assurances can be given that any aims, assumptions, or expectations expressed or implied in the Content were or will be realized, or that the activities described have continued or will continue at all or in the same manner as described.