Case Study

Routine Process Automation Platform

Scalable Internal Automation for Data-Driven Teams

The Routine Process Automation (RPA) platform is an internal system that handles end-to-end product data workflows: file-based supplier data ingestion, supplier website scraping, data cleaning and transformation, and import of the cleaned data into the internal SQL Server database (accessed through SSMS).

Internal Automation Team/R&D Department
Pharmaceutical, Chemical, Data & Research Services

Client Overview

The Internal Automation Team serves a data-driven organization managing large-scale product data pipelines for research, sourcing, and market intelligence.

Faced with increasing supplier and product volumes, the team required an intelligent, scalable automation system to eliminate manual scraping, data cleaning, and import processes, enabling operational efficiency and consistency across workflows.

Industry

Pharmaceutical, Chemical, Data & Research Services, Enterprise Automation Solutions

Project Type

Enterprise Internal Automation, Robotic Process Automation Ecosystem (RPA), and Data Cleaning & ETL Pipelines.

Technologies

.NET (C#)
React.js
Python
Flask + Flask-SocketIO
SQL Server
MySQL
FileManager API
Docker
IIS

Major Features Delivered

Core capabilities delivered across the upload, scraping, cleaning, and import workflow.

Upload Interface

Allows users to upload Excel/TXT supplier files, select the relevant supplier module, and choose the processing server. This ensures files are routed correctly and processed efficiently.
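A minimal sketch of the kind of check an upload step like this might run before routing a file. The allowed extensions are inferred from the Excel/TXT description, and the function name, module names, and server names are all hypothetical.

```python
from pathlib import Path
from typing import List, Set

# Assumed from the Excel/TXT description; the real allow-list is not documented.
ALLOWED_EXTENSIONS = {".xlsx", ".xls", ".txt"}


def validate_upload(filename: str, module: str, server: str,
                    known_modules: Set[str], known_servers: Set[str]) -> List[str]:
    """Return a list of problems; an empty list means the upload can be routed."""
    problems: List[str] = []
    if Path(filename).suffix.lower() not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported file type: {filename}")
    if module not in known_modules:
        problems.append(f"unknown supplier module: {module}")
    if server not in known_servers:
        problems.append(f"unknown processing server: {server}")
    return problems
```

Returning every problem at once, rather than stopping at the first, lets a portal show the user all corrections in one pass.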

Distributed Scraper Clients

Supports standard, VPN, and Tor-based clients, enabling resilient scraping while accessing region-restricted supplier websites and rotating IPs for reliable data collection.

File Manager Integration

Integrates with a centralized FileManager API to manage all input and output files systematically, ensuring documents are traceable, organized, and securely stored.

Middleware Server Scheduling

The middleware server dynamically schedules and distributes scraping and processing tasks across clients based on availability and task type, maximizing system throughput.
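The scheduling idea can be illustrated with a small sketch. It assumes each task declares the kind of client it needs (standard, VPN, or Tor) and that the scheduler picks the first idle client of that kind. The class and field names are hypothetical, and the platform's actual scheduling policy is not described here.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Client:
    name: str
    kind: str           # "standard", "vpn", or "tor"
    busy: bool = False


@dataclass
class Task:
    task_id: int
    required_kind: str  # client kind this task must run on


def assign_tasks(tasks: List[Task], clients: List[Client]) -> Dict[int, Optional[str]]:
    """Give each task the first idle client of the required kind.

    Tasks with no idle matching client map to None and would stay queued.
    """
    assignments: Dict[int, Optional[str]] = {}
    for task in tasks:
        match = next(
            (c for c in clients if not c.busy and c.kind == task.required_kind),
            None,
        )
        if match is not None:
            match.busy = True  # reserve the client for this task
            assignments[task.task_id] = match.name
        else:
            assignments[task.task_id] = None
    return assignments
```

A first-idle-match policy is the simplest option; a production scheduler would likely also weigh queue age, client load, and retries.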

Automated Cleaning & Importing

Automatically cleans, standardizes, and transforms scraped data before importing it into SQL Server, ensuring data consistency and readiness for analysis and operations.
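As an illustration of what a cleaning step of this kind might do to a single scraped row, the sketch below collapses whitespace, checks the shape of a CAS Registry Number, and parses a price. The field names (name, cas, price) and the dot-decimal price format are assumptions, not the platform's actual schema.

```python
import re
from typing import Dict, Optional


def clean_record(raw: Dict[str, str]) -> Dict[str, Optional[object]]:
    """Normalize one scraped product row (field names are hypothetical)."""
    # Collapse runs of whitespace inside the product name.
    name = re.sub(r"\s+", " ", raw.get("name", "")).strip()

    # CAS Registry Numbers have the shape: 2-7 digits, 2 digits, 1 check digit.
    cas: Optional[str] = raw.get("cas", "").strip()
    if not re.fullmatch(r"\d{2,7}-\d{2}-\d", cas):
        cas = None

    # Keep digits and dots only; assumes prices use a dot as decimal separator.
    price_text = re.sub(r"[^\d.]", "", raw.get("price", ""))
    try:
        price: Optional[float] = float(price_text)
    except ValueError:
        price = None

    return {"name": name, "cas": cas, "price": price}
```

Values that fail validation become None instead of raising, so one bad field does not block the rest of the row from being imported.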

Logs & Status Tracking

Provides real-time visibility into the status of each automation task, detailed logs for every step, and progress tracking via the RPA web portal, ensuring transparency and easy troubleshooting.
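A sketch of how per-task status and progress could be derived from a step-by-step log. The stage names below are hypothetical, since the platform's real status values are not listed.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Tuple

# Hypothetical stage names, in the order a task would pass through them.
STAGES = ("queued", "scraping", "cleaning", "importing", "done")


@dataclass
class TaskLog:
    task_id: str
    events: List[Tuple[str, str, str]] = field(default_factory=list)

    def record(self, stage: str, message: str) -> None:
        """Append a timestamped log entry for the given stage."""
        if stage not in STAGES:
            raise ValueError(f"unknown stage: {stage}")
        stamp = datetime.now(timezone.utc).isoformat()
        self.events.append((stamp, stage, message))

    @property
    def status(self) -> str:
        """Stage of the most recent entry, or "queued" if nothing is logged."""
        return self.events[-1][1] if self.events else "queued"

    @property
    def progress(self) -> float:
        """Fraction of stages completed, from 0.0 (queued) to 1.0 (done)."""
        return STAGES.index(self.status) / (len(STAGES) - 1)
```

Deriving status and progress from the log itself keeps the portal view and the audit trail from drifting apart.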

Advanced Processing Extensions

Supports image generation, email validation, and chemical information enrichment using PubChem, expanding the system's capabilities beyond basic scraping and importing.
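For the PubChem step, one plausible approach is PubChem's public PUG REST interface, which can look up compound properties by name. The sketch below only builds the request URL and makes no network call. Whether the platform uses PUG REST or these particular properties is an assumption.

```python
from typing import Sequence
from urllib.parse import quote

PUG_REST_BASE = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def pubchem_property_url(
    compound_name: str,
    properties: Sequence[str] = ("MolecularFormula", "MolecularWeight"),
) -> str:
    """Build a PUG REST URL that looks up properties by compound name."""
    # Percent-encode the name so spaces and slashes are safe in the path.
    name = quote(compound_name.strip(), safe="")
    props = ",".join(properties)
    return f"{PUG_REST_BASE}/compound/name/{name}/property/{props}/JSON"
```

The resulting URL can be fetched with any HTTP client. Rate limiting and error handling are left out of this sketch.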

Downloadable Output

Users can view and download cleaned and processed output files directly from the platform, which simplifies integration into downstream workflows.

Challenges & Solutions

We identified key pain points and developed targeted solutions to streamline the team's product data workflows.

Challenges

Lack of Centralized Scraper Management

There was no centralized system for managing scraper pipelines and logs.

Inconsistent File Formats

Inconsistent supplier file formats required dynamic processing logic.
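One common way to handle varying supplier layouts is to map each file's headers onto a canonical set of columns before processing. The sketch below shows that idea. The alias table and column names are invented for illustration and are not the platform's real configuration.

```python
from typing import Dict, List

# Hypothetical alias table: each canonical column lists header spellings
# that different suppliers might use for it.
HEADER_ALIASES: Dict[str, List[str]] = {
    "product_name": ["product name", "name", "item"],
    "cas_number": ["cas", "cas no", "cas number"],
    "price": ["price", "unit price", "cost"],
}


def map_headers(headers: List[str]) -> Dict[str, str]:
    """Map each recognized supplier header to its canonical column name."""
    lookup = {
        alias: canon
        for canon, aliases in HEADER_ALIASES.items()
        for alias in aliases
    }
    mapping: Dict[str, str] = {}
    for header in headers:
        # Normalize case, underscores, and trailing dots before the lookup.
        key = header.strip().lower().replace("_", " ").rstrip(".")
        if key in lookup:
            mapping[header] = lookup[key]
    return mapping
```

Unrecognized headers are simply left out of the mapping, so a caller can flag them for review instead of guessing.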

Region-Restricted Supplier Websites

Accessing region-restricted supplier websites required the use of a VPN or Tor.

File and Log Synchronization

Files and logs had to be kept synchronized between clients and servers.

Data Traceability

Cleaned data had to remain traceable throughout the import process.

Scalability for New Suppliers

Onboarding new suppliers and modules did not scale with the existing manual process.

Solutions

Modular RPA Architecture

We designed a modular, scalable RPA architecture using Flask, SocketIO, and Docker.
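To illustrate what "modular" can mean in practice, the sketch below uses a small registry so that each supplier's processing logic is a self-registering function. This is a generic plugin pattern, not the platform's confirmed design, and the supplier name and function names are hypothetical.

```python
from typing import Callable, Dict, List

Row = Dict[str, str]
SupplierModule = Callable[[List[Row]], List[Row]]

_MODULES: Dict[str, SupplierModule] = {}


def supplier_module(name: str) -> Callable[[SupplierModule], SupplierModule]:
    """Decorator that registers a processing function under a supplier name."""
    def register(func: SupplierModule) -> SupplierModule:
        if name in _MODULES:
            raise ValueError(f"module already registered: {name}")
        _MODULES[name] = func
        return func
    return register


def run_module(name: str, rows: List[Row]) -> List[Row]:
    """Dispatch rows to the module registered for the given supplier."""
    try:
        module = _MODULES[name]
    except KeyError:
        raise LookupError(f"no module for supplier: {name}") from None
    return module(rows)


@supplier_module("acme_chem")  # hypothetical supplier
def process_acme(rows: List[Row]) -> List[Row]:
    # Tag every row with the supplier it came from.
    return [{**row, "supplier": "acme_chem"} for row in rows]
```

With this pattern, adding a supplier means adding one decorated function, and the dispatch code stays unchanged.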

Middleware Server

Built a middleware server to intelligently schedule and route tasks to appropriate clients.

VPN and Tor Scraping

Integrated VPN and Tor-based scraping for region-specific data extraction.

Chemical Enrichment

Enabled PubChem-based chemical data enrichment.

Centralized File Manager

Developed a centralized FileManager API for managing input/output files.

Log Tracking System

Built a log-tracking system for complete visibility and auditing.

Cleaner Engine & DataUploadClient

Developed a Cleaner Engine that automatically converts raw scraped data into import-ready formats, and a DataUploadClient that securely imports the cleaned data into SQL Server.
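A sketch of a traceable, parameterized import. It uses Python's built-in sqlite3 module as a stand-in for SQL Server so the example runs anywhere. The table layout and the source_file column are assumptions made to show how each row could be traced back to its input file.

```python
import sqlite3
from typing import Dict, Iterable


def import_rows(conn: sqlite3.Connection,
                rows: Iterable[Dict[str, object]],
                source_file: str) -> int:
    """Insert cleaned rows in one transaction, tagging each with its source file."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS products ("
        "name TEXT NOT NULL, cas TEXT, price REAL, source_file TEXT NOT NULL)"
    )
    params = [(r["name"], r.get("cas"), r.get("price"), source_file) for r in rows]
    # Placeholders keep scraped values out of the SQL text itself.
    with conn:  # commits on success, rolls back if any insert fails
        conn.executemany(
            "INSERT INTO products (name, cas, price, source_file) "
            "VALUES (?, ?, ?, ?)",
            params,
        )
    return len(params)
```

Running the whole batch in one transaction means a failed import leaves no partial data behind, and the source_file column lets any record be traced to the file that produced it.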

Business Value Provided

80% Time Reduction

Reduced manual data entry and processing time by over 80%.

Parallel Processing for 100+ Suppliers

Enabled handling of 100+ suppliers in parallel.

Improved Data Accuracy

Improved data accuracy and consistency across product records.

Centralized Monitoring & Traceability

Provided centralized monitoring and traceability for all automation requests.

Scalable for New Modules

Made it possible to onboard new modules and suppliers with minimal developer intervention.

Chemical Info Integration

Enhanced data enrichment for research and analysis with chemical info integration.

Connecting Continents, Empowering Businesses
Our branch offices ensure seamless support across the globe.

USA
2000 N Central Expressway
Suite 220
Plano, TX 75074
United States
inquiry@mol-tech.us
+1-945-209-7691

Singapore
408 Joo Chiat Place
Singapore (428085)
inquiry@mol-tech.us
+65 8753 5833

Switzerland
Kirchmooshöhe 4
4800 Zofingen
inquiry@mol-tech.us

India
5th Floor, 506,
Dwarkesh business hub
Opp. Visamo Society, Motera,
380005, Ahmedabad, Gujarat
inquiry@mol-tech.us
+91 81286 94374