Profile
I am a machine learning enthusiast, data scientist, Python programmer, and former network engineer. I focus on applied science, sustainability, and open source solutions.
Education
Hierarchical density-based clustering and interpretation for network measurements.
- The aim was to detect and interpret unknown patterns in passive/active network and Cloud measurements using hierarchical density-based unsupervised machine learning techniques.
- Bc.: Measurement of glottal period of human voice
- Measurement, automatic marking and analysis of glottal period of human voice
- Ing. (Master's): Classifiers for identification of the speaker
- Building on the Bc. results: identification of speakers by glottal period and other auxiliary features
Open Source contributions
Collaboration on open source deep learning library pytorch-widedeep:
Collaboration on data-science-related tasks under the username Pavol86; see the Phabricator tickets and related git merges:
Recent Experience - work
Analysis and classification of data from diverse security tools and systems. Detection of patterns, anomalies, and potential security threats. Research and application of machine learning methods (NLP, LLM, statistics) to enhance and enrich solution outputs.
Design and analysis of machine learning approaches for pattern and anomaly detection in synthetic & real-world data. For related projects see the Projects section. Related projects:
- SUCCESS-6G: Towards robust, secure and computationally efficient vehicular services in 6G
- git repo - https://5uperpalo.github.io/success6g-edge/
- description - data analysis, design, and implementation of the solution for AIoT in V2X
- used technologies - Python (numpy, pandas, matplotlib, etc.), MySQL, MLflow, Kubeflow, Kubernetes (Minio, Prometheus, Zero-to-Jupyterhub, Flower, Kserve, Knative, Istio, Kepler, Grafana, InfluxDB), streamlit, models (LLMs/Huggingface, Transformers, GBMs, sklearn, torch, pytorch-widedeep)
- FIREMAN (Framework for the Identification of Rare Events via MAchine learning and IoT Networks)
- git repo - https://github.com/5uperpalo/FIREMAN-project
- description - data analysis and implementation of the solution for classification of rare events in IIoT
- used technologies - Python (numpy, pandas, matplotlib, etc.), models (LLMs/Huggingface, Transformers, GBMs, sklearn, torch, pytorch-widedeep, HDBSCAN, OPTICS, DBSCAN, GANs, GAIN, scikit-multiflow/river, deep-river), MOA, Kafka, Airflow, MLflow, KSQL, Faust, InfluxDB, Flask
Design, analysis, and implementation of machine learning approaches for (i) prediction of customer lifetime value in mobile apps and (ii) an in-app purchase recommendation system. Related projects:
- IAP (In-App-Purchases)
- PLTV (Predicted LifeTime Value)
- description - data analysis, design, and implementation of a solution to predict customer lifetime value from aggregated data on early in-app behavior and additional auxiliary data
- used technologies - Python (numpy, pandas, matplotlib, etc.), Athena/SQL, AWS, models (LLMs/Huggingface, Transformers, GBMs, sklearn, torch, pytorch-widedeep, H2O, DBSCAN, OPTICS), Weights & Biases
Experience - work
Design and development of a communication interface between the Slovak Electricity Hydro optimization model and the user GUI. Docker-containerized solution using Redis as an in-memory database, Flask as the web framework, and Celery for multi-processing. Design of unit tests. Conversion of procedural code to object-oriented code. Code refactoring.
Migration of an existing Apache XML firewall/load balancer solution to an F5 load balancer for a Ministry of Agriculture project
- Network Engineer (VSHosting, Prague, CZ)
- Network Consulting Engineer (Verizon, Prague, CZ)
- Senior System Engineer (AT&T, Bratislava, SK)
- HP Radia Specialist (Soitron, Bratislava, SK)
- HP Monitoring Support Specialist (Soitron, Bratislava, SK)
- IT VoIP Support Specialist (Soitron, Bratislava, SK)
Experience - internships
Application of unsupervised machine learning (ML) approaches to network (NW) traces (MAWI, Darknet). Generalization and improvement of the hierarchical density-based clustering approach to NW measurement interpretation proposed during the AIT Vienna internship. Improvement of PySpark ML scripts running in a distributed UX server environment. Results were summarized in conference papers.
Analysis of the relations between the socioeconomic status of customers and network performance, and investigation of potential discrimination in network deployment and management. Correlation of the LSOA database (Lower-layer Super Output Areas) with operator measurements using Geographic Information System tools (QGIS, ArcGIS, GeoPandas) in a distributed computing environment (PySpark).
Cybersecurity and network performance analysis, anomaly detection and diagnosis. Application of supervised, unsupervised, batch, and stream-based machine learning techniques to big network measurement datasets (MAWI and Cloud latency). Integration of machine learning approaches into big data analytics platforms, in particular within the distributed computing environment of the BIG-DAMA project. Use of distributed computing tools and platforms such as Cloudera, PySpark, Apache Pig, Hive, Kafka, Elasticsearch, etc. Running and configuring machine learning Bash scripts on a Linux server. Results were summarized in conference papers.
Experience - pedagogical
Mentoring, academic support, and provisioning of a computing environment for an undergraduate intern visiting NII Tokyo for three weeks, supported by the Sakura Science Plan internship.
Network operating systems, Linux, Unix. Administration and network tools, managing and administration of documentation. Basic concepts, configuration and procedures in operating systems administration (UNIX).
Basic principles of both classical and programmable logic devices and their practical use in the design of digital systems. Design and implementation of digital circuits in the VHDL language. Implementation of logic gates, measurement of their static and dynamic properties. Verification of digital circuits in the simulator.
Review of the principles of switching systems, i.e. (i) switching fields, (ii) control systems, and (iii) signaling for switching control (in the central office as well as in networks). Focus on digital switching systems with circuit switching as well as transport of IP packets. Basic overview of the convergence of voice and data services and networks, including functional principles of new generation networks with respect to the philosophy and services of the intelligent network.
Certifications (expired)
- Cisco Certified Network Associate (CCNA, 640-802)
- (640-553) Implementing Cisco IOS Network Security
- (640-460) Implementing Cisco IOS Unified Communications
- (640-721) Implementing Cisco Unified Wireless Networking Essentials
- Cisco Certified Design Associate (CCDA)
- (640-863) Designing for Cisco Internetwork Solutions
- Cisco Certified Network Professional (CCNP)
- (642-901) Building Scalable Cisco Internetworks
- (642-812) Building Cisco Multilayer Switched Networks
- (642-825) Implementing Secure Converged Wide Area Networks
- (642-845) Optimizing Converged Cisco Networks
- Cisco Certified Internetwork Professional (CCIP)
- (642-642) Quality of Service
- (642-611) Multiprotocol Label Switching
- (642-661) Border Gateway Protocol
- Cisco Certified Design Professional (CCDP)
- (300-320) Designing Cisco Network Service Architectures
- (642-873) Designing Cisco Network Service Architectures
- Conducting Cisco Unified Wireless Site Survey (CUWSS, 642-731)
- Implementing Cisco Edge Network Security Solutions (SENSS, 300-206)
- F5 Certified Product Consultant for LTM (F5-PCL, F50-531)
- F5 Certified Administrator
- (101) Application Delivery Fundamentals
- (201) TMOS Administration
- Juniper Networks Certified Internet Associate EX (JNCIA-EX, JN0-400)
- Information Technology Infrastructure Library Foundation in IT Service Management (ITILv3, Foundation)
- The Open Group Architecture Framework (TOGAF 9)
- ArchiMate 3
Other projects
- Metrics for Automated Detection of Cloud Anomalous Behavior, focused on automated detection and interpretation of suspicious events in active Cloud latency measurements
- Practical Privacy-Preserving Data Collection and Utilization using Provable Cryptographic Tools
- Privacy Protection and Machine Learning Utilization of IoT Data in Cloud
- Cloud Performance Analysis and Improvement
- Smart-home IoT and Cloud Telemetry Datamining
- Methods Enhancing Work with Cloud Data
- SUHEC (SUrname HEritage Classifier)
- git repo - https://5uperpalo.github.io/surname_heritage_classifier/
- description - A hobby project to classify surnames by country and area of the world. An attempt at an open source alternative to paid services.
- CHURNPRED (CHURN PREDictor)
- git repo - https://github.com/5uperpalo/churnpred
- description - Customer churn predictor.