XML - Python Financial Web Scraping Bash Script based on Selenium (GitHub)
Project Overview
Created a Cron-automated XML - Python Financial Web Scraping Script based on Selenium
Created a Python program that scrapes and processes financial securities data from the Clearing Corporation of India, using Selenium and ChromeDriver automation to download the data as XML.
Exported the scraped securities data from a pandas DataFrame to a CSV file.
Used Cron in Ubuntu Linux to run the Python script and update the securities data CSV file every minute.
Utilised the pyodbc driver to truncate outdated data within the dbo.Securities_Wise_Holdings table in the Microsoft Azure SQL database.
Bulk copied the updated securities data CSV file into the Microsoft Azure SQL cloud database every minute using the BCP utility.
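The parse-and-export step described above can be sketched roughly as follows. This is a minimal illustration rather than the project's code: the XML layout (one `row` element per security), the sample field names and the function name `xml_rows_to_csv` are all assumptions, and it uses the standard library's `xml.etree.ElementTree` and `csv` modules in place of Selenium and pandas so that it runs on its own.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical sample; the real feed's element and field names are not given here.
SAMPLE_XML = """<securities>
  <row><name>SEC-A</name><holding>1500.25</holding></row>
  <row><name>SEC-B</name><holding>980.00</holding></row>
</securities>"""


def xml_rows_to_csv(xml_text, row_tag="row"):
    """Flatten the children of each row element into one CSV record."""
    root = ET.fromstring(xml_text)
    records = [
        {child.tag: (child.text or "").strip() for child in row}
        for row in root.iter(row_tag)
    ]
    # Collect column names in first-seen order so rows with missing fields still fit.
    columns = []
    for record in records:
        for name in record:
            if name not in columns:
                columns.append(name)
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


csv_text = xml_rows_to_csv(SAMPLE_XML)
print(csv_text)
```

With pandas available, the list of record dictionaries could instead be passed to `pandas.DataFrame(records)` and written with `DataFrame.to_csv(path, index=False)`.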
Gallery
Using Selenium and ChromeDriver to automate scraping of XML financial data from datatables
Automation of Python script in action in Ubuntu Linux
Cron scheduling in Ubuntu to run script and update CSV file every minute
BCP (Bulk Copy Program) Utility to bulk insert updated financial data into Microsoft Azure SQL cloud database every minute
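The truncate-and-reload step could look something like the sketch below. It only builds the SQL statement and the `bcp` argument list and does not execute anything; the function names, the CSV file name and the connection details are placeholders, and the assumption that the CSV is comma-delimited with a single header row is mine rather than something stated in the project.

```python
import shlex

TABLE = "dbo.Securities_Wise_Holdings"  # table name taken from the project description


def truncate_statement(table=TABLE):
    """SQL that clears the outdated rows before the reload."""
    return f"TRUNCATE TABLE {table};"


def bcp_import_command(csv_path, server, database, user, password, table=TABLE):
    """Build the argument list for a character-mode, comma-delimited bcp import."""
    return [
        "bcp", table, "in", csv_path,
        "-S", server,    # server name
        "-d", database,  # database name
        "-U", user,      # login
        "-P", password,  # password
        "-c",            # character data mode
        "-t", ",",       # field terminator
        "-F", "2",       # first row to load; assumes one header line in the CSV
    ]


# Placeholder connection details; nothing is executed here.
command = bcp_import_command(
    "securities.csv", "example.database.windows.net", "exampledb", "user", "secret"
)
print(truncate_statement())
print(shlex.join(command))
```

In an actual run, the statement would presumably be sent through a pyodbc cursor (`cursor.execute(...)` followed by `connection.commit()`), and the argument list passed to `subprocess.run(command, check=True)`.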
XML - CSV File Conversion
Convert files between XML and CSV / XLSX format (GitHub)
Project Overview
Created Python scripts to convert between hierarchical XML and CSV / XLSX format using ElementTree and json
Created a Python script to convert a multi-sheet XLSX document to XML format with its data and corresponding XPaths using ElementTree.
The hierarchical XML structure for each data entry is built by recursively creating child nodes from the root node.
Created a Python script to convert a nested hierarchical XML file into a multi-sheet XLSX document with its data and corresponding XPaths.
Converted the XML file into a JSON-style dictionary and extracted the relevant data entries from it.
Parsed the values and XPaths into separate pandas DataFrames and wrote the DataFrames to a single multi-sheet XLSX document.
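Both conversion directions rest on pairing each data entry with its XPath, which the sketch below illustrates under some assumptions. The sample document and the helper names `flatten` and `build` are hypothetical; it walks the ElementTree directly rather than going through a JSON dictionary, uses a loop rather than recursion when rebuilding, and leaves out the XLSX reading and writing.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document.
SAMPLE_XML = "<order><id>7</id><customer><name>Ann</name><city>Pune</city></customer></order>"


def flatten(element, path=""):
    """Yield (xpath, text) for every leaf element, walking the tree recursively."""
    current = f"{path}/{element.tag}"
    children = list(element)
    if not children:
        yield current, (element.text or "").strip()
    for child in children:
        yield from flatten(child, current)


def build(rows):
    """Recreate a tree from (xpath, text) rows, creating child nodes from the root down.

    Simplification: repeated sibling tags are merged, because the paths
    carry no positional index.
    """
    root = None
    for xpath, text in rows:
        tags = xpath.strip("/").split("/")
        if root is None:
            root = ET.Element(tags[0])
        node = root
        for tag in tags[1:]:
            child = node.find(tag)
            if child is None:
                child = ET.SubElement(node, tag)
            node = child
        node.text = text
    return root


rows = list(flatten(ET.fromstring(SAMPLE_XML)))
rebuilt = ET.tostring(build(rows), encoding="unicode")
print(rows)
print(rebuilt)
```

The `rows` list is the kind of table that could be split into a values sheet and an XPaths sheet, for example with `pandas.DataFrame.to_excel` and a `pandas.ExcelWriter`.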
Gallery
Conversion of multi-sheet XLSX document into hierarchical XML file format
Conversion of nested XML file into XLSX file sheet with data entries
Conversion of nested XML file into XLSX file sheet with corresponding XPaths