My entry into data science began with a deep interest in mathematics and logic. With a solid foundation in probability theory, statistics, mathematical analysis, and linear algebra, I quickly grew comfortable working on advanced machine learning problems.
Long before the mainstream rise of large language models like GPT, I was exploring practical applications of deep learning in natural language processing. One notable project was a toxic comment classification system for Russian social media: I used BERT (via TensorFlow Hub) together with CatBoost for binary classification, achieving 85% accuracy on a Kaggle-labeled test dataset.
I then tested the model on real-world data scraped from VK via its API, which surfaced the classic challenge of domain adaptation: the model performed worse on live data, underscoring the need for a more diverse training set. The experiment was a valuable lesson in model generalization and data strategy.
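In outline, the pipeline encoded each comment into a fixed-length BERT embedding and then trained a gradient-boosting classifier on those vectors. The sketch below is a minimal illustration of that pattern, not the original code: the synthetic 768-dimensional "embeddings", the labels, and scikit-learn's GradientBoostingClassifier standing in for CatBoost are all assumptions made so the example runs without downloading a pretrained model.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Stand-in for sentence embeddings: in the real project these came from a
# pretrained BERT encoder loaded via TensorFlow Hub. Here we draw synthetic
# 768-dimensional vectors so the sketch runs without fetching a model.
rng = np.random.default_rng(42)
n_samples, dim = 400, 768
X = rng.normal(size=(n_samples, dim))

# Synthetic binary labels ("toxic" vs. "not toxic") with a learnable signal:
# the class depends on the sign of the sum of the first few coordinates.
y = (X[:, :5].sum(axis=1) > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Gradient-boosted trees over the embedding features. The original project
# used CatBoost; GradientBoostingClassifier stands in here so the example
# has no dependencies beyond scikit-learn.
clf = GradientBoostingClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)

acc = accuracy_score(y_test, clf.predict(X_test))
print(f"held-out accuracy: {acc:.2f}")
```

The same two-stage shape (frozen encoder, separate tree-based head) is what made swapping the classifier or retraining on new domains cheap, since only the head has to be refit.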
I also worked for a month as a data scientist at a financial firm in Asia, gaining practical industry experience and refining my ability to extract insights under pressure. The work focused on real-time data handling and evaluation pipelines, deepening my fluency with Pandas, SQL, CatBoost, and business-oriented metrics.
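The Pandas aggregation step at the heart of such an evaluation pipeline can be sketched as below. The DataFrame, its column names, and the default-rate metric are hypothetical examples chosen for illustration, not the firm's actual data or metrics.

```python
import pandas as pd

# Hypothetical snapshot of transaction data, the kind of frame an
# evaluation pipeline would pull from SQL before computing metrics.
df = pd.DataFrame({
    "client_id": [1, 1, 2, 2, 3],
    "amount":    [120.0, 80.0, 200.0, 50.0, 75.0],
    "defaulted": [0, 0, 1, 0, 0],
})

# Per-client aggregates: total exposure and whether the client ever defaulted.
summary = df.groupby("client_id").agg(
    total_amount=("amount", "sum"),
    ever_defaulted=("defaulted", "max"),
)

# A simple business-facing metric: share of clients with at least one default.
default_rate = summary["ever_defaulted"].mean()
print(summary)
print(f"client default rate: {default_rate:.2%}")
```

Keeping the metric definitions as small, named aggregations like this makes them easy to review with non-technical stakeholders and to re-run as fresh data arrives.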
Alongside self-driven projects, I completed formal coursework, including Codecademy's Data Scientist Career Path and Building Deep Learning Models with TensorFlow, well before AI became the global buzzword it is today.
Technical Stack
- Languages: Python, SQL, HTML/CSS
- Libraries: NumPy, Pandas, scikit-learn, Matplotlib, Seaborn, CatBoost, XGBoost, LightGBM
- Deep Learning: TensorFlow, Keras, transformers (Hugging Face), word2vec, fastText, BERT
- NLP Tools: NLTK, regex, BeautifulSoup
Data science taught me not just how to work with models, but how to think in systems—an approach that later shaped how I build software, structure content, and design entire platforms from the ground up.