Distributed processing project that analyzes weather data for Medellín (2023-2024) using MapReduce on Hadoop. It computes the average temperature and total precipitation per month.
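The per-month aggregation described above can be sketched locally in plain Python, without a Hadoop cluster. This is only an illustration, not the project's actual code: the input layout (`date,temperature,precipitation` per line) and the function names `map_record`, `reduce_month`, and `run_job` are assumptions made for this example.

```python
from collections import defaultdict


def map_record(line):
    """Map step: emit (month, (temperature, precipitation)) for one line.

    Assumes a hypothetical CSV layout: YYYY-MM-DD,temperature,precipitation
    """
    date, temp, precip = line.strip().split(",")
    month = date[:7]  # keep only "YYYY-MM" as the grouping key
    return month, (float(temp), float(precip))


def reduce_month(values):
    """Reduce step: average temperature and total precipitation for one month."""
    temps = [t for t, _ in values]
    total_precip = sum(p for _, p in values)
    return sum(temps) / len(temps), total_precip


def run_job(lines):
    """Simulate the shuffle phase locally by grouping mapped values by key."""
    groups = defaultdict(list)
    for line in lines:
        key, value = map_record(line)
        groups[key].append(value)
    return {month: reduce_month(vals) for month, vals in sorted(groups.items())}


if __name__ == "__main__":
    # Made-up sample records, for illustration only.
    sample = ["2023-01-01,20.0,1.5", "2023-01-02,22.0,0.5", "2023-02-01,24.0,3.0"]
    for month, (avg_temp, total_precip) in run_job(sample).items():
        print(month, avg_temp, total_precip)
```

On a real cluster the grouping done in `run_job` is performed by Hadoop's shuffle phase, and the map and reduce functions would run as separate tasks.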
This project implements a **Hadoop MapReduce job in Pseudo-Distributed Mode** to determine the **feistiest Pokémon** based on their **type**. The job processes the Pokémon dataset ...
Abstract: MapReduce is a very popular programming model used to handle large datasets in enterprise data centers and clouds. Although various implementations of MapReduce exist, Hadoop MapReduce is ...
Python, R, Data Modeling, Data Warehousing, Athena, Talend, JSON, XML, YAML, Kubernetes, Docker, Snowflake, Tableau, Power BI, JIRA, Agile Methodologies, Data ...
Abstract: MapReduce has emerged as a strong model for parallel and distributed processing of huge datasets. Hadoop, an open-source implementation of MapReduce, has driven wide adoption of the model.
MapReduce developers face a steep learning curve when first deploying and configuring a Hadoop cluster and later when verifying program correctness. Compounded by long execution times (measured in ...
Reporting and analysis drive businesses to make the best possible decisions. The source of all these decisions is data. There are two types of data: structured and unstructured. Most recently, ...