Hi, I am Takreem Akhter

I am an Azure Data Engineer passionate about everything that is driven by data.

I write about big data, machine learning, and the services Microsoft Azure provides in these areas.

Additionally, I document my ventures into this data-driven world.

  • How to: Copy delimited files having column names with spaces in parquet format using ADF

    Delimited files whose column names contain spaces cannot be ingested directly in Parquet format using Azure Data Factory’s copy activity. We encounter the following error:

  • Intro to Massively Parallel Processing

    Massively Parallel Processing (MPP) databases have been around for decades, but their cost and the complexity of managing them have dropped tremendously in the last decade. Until recently, the only option was to self-host these databases, but they have since migrated to the cloud.

  • Using Azure Stream Analytics to analyze IoT data

    I used a Raspberry Pi IoT online simulator to imitate a sensor that monitors and sends the air intake temperature and humidity of an automobile engine. I used this “IoT” data to detect when the temperature and humidity go beyond a limit that hampers the engine’s fuel utilization efficiency. When the temperature or humidity crosses the permissible limit (an assumed value here) for a certain time interval, I collect that data in a separate container, which may be used for further analysis to estimate its effect on the engine’s life.

  • Using Azure Data Factory to analyze a Cars dataset

    Azure Data Factory can be used for various ETL/ELT and data integration scenarios. Here, I use it to process and transform data and publish it to Azure Data Lake Storage.

  • Data pipeline for the Trending YouTube Video dataset from Kaggle

    The dataset for the Indian YouTube trending page is not clean enough for insightful analysis. So, I created an ETL pipeline to clean up the dataset and make it easier to analyze. This post is all about how I created this pipeline using PySpark on the Databricks platform.

  • How I became a Microsoft Certified Azure Data Engineer Associate

    Microsoft describes Azure Data Engineers as people who:

  • How I cleared the AZ-900 (Azure Fundamentals) Exam

    Microsoft’s AZ-900 exam checks a candidate’s foundational knowledge of cloud computing and Azure services. It is meant for people new to this field, so even if you are not from a computer science background, you should be able to clear it. This certification is a great way to prove your knowledge and zeal to learn.