Data's Blog

Amazon EMR’s architecture is designed to efficiently process vast amounts of data.

At its core, the architecture comprises three types of nodes:

1) master, 2) core, 3) task.

The master node coordinates the cluster: it distributes jobs to the core and task nodes and tracks their status.

Core nodes are essential because they both run tasks and store data in the Hadoop Distributed File System (HDFS), which keeps that data persistent while the cluster runs.

Task nodes, on the other hand, are optional and can be added to scale computing power, focusing solely on processing tasks without storing data.
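To make the three roles concrete, here is a minimal sketch of how they might be written down as instance groups. The dictionary keys and the role names `MASTER`, `CORE`, and `TASK` follow the shape of the `InstanceGroups` parameter in boto3's `run_job_flow` call, but nothing here contacts AWS. The helper name `build_instance_groups`, the default counts, and the `m5.xlarge` instance type are placeholders chosen for illustration, not recommendations.

```python
def build_instance_groups(core_count=2, task_count=0, instance_type="m5.xlarge"):
    """Return one instance-group dict per node type in use.

    Hypothetical helper: the counts and instance type are illustrative
    assumptions, not values taken from any real cluster.
    """
    if core_count < 1:
        raise ValueError("this sketch assumes at least one core node")

    groups = [
        # Exactly one master node coordinates the cluster.
        {"Name": "Master", "InstanceRole": "MASTER",
         "InstanceType": instance_type, "InstanceCount": 1},
        # Core nodes run tasks and store data in HDFS.
        {"Name": "Core", "InstanceRole": "CORE",
         "InstanceType": instance_type, "InstanceCount": core_count},
    ]

    # Task nodes are optional: they only add compute and store no data.
    if task_count > 0:
        groups.append(
            {"Name": "Task", "InstanceRole": "TASK",
             "InstanceType": instance_type, "InstanceCount": task_count}
        )
    return groups


if __name__ == "__main__":
    for group in build_instance_groups(core_count=2, task_count=3):
        print(group["InstanceRole"], group["InstanceCount"])
```

Leaving `task_count` at zero simply omits the task group, which mirrors the fact that task nodes are optional while master and core nodes are not.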

This setup allows for flexibility in managing resources and cost, adapting to the workload by adjusting the number and type of nodes.
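As a rough illustration of that flexibility, the sketch below resizes only the task group of a cluster description and leaves the master and core groups alone. It works on plain Python dictionaries and does not call AWS; the function name `scale_task_nodes`, the sample cluster, and the `m5.xlarge` instance type are all assumptions made for the example. A real resize would go through the EMR console or API.

```python
def scale_task_nodes(groups, new_task_count, task_instance_type="m5.xlarge"):
    """Return a new list of instance groups with the task group resized.

    Hypothetical helper: master and core groups are copied unchanged,
    because only task nodes can be added or removed without touching
    stored data.
    """
    if new_task_count < 0:
        raise ValueError("new_task_count must be zero or greater")

    # Keep every group that is not a task group, copying each dict
    # so the caller's original list is not modified.
    resized = [dict(g) for g in groups if g["InstanceRole"] != "TASK"]

    # Add a task group back only if some task nodes are wanted.
    if new_task_count > 0:
        resized.append(
            {"Name": "Task", "InstanceRole": "TASK",
             "InstanceType": task_instance_type,
             "InstanceCount": new_task_count}
        )
    return resized


if __name__ == "__main__":
    # Illustrative starting point: one master node and two core nodes.
    cluster = [
        {"Name": "Master", "InstanceRole": "MASTER",
         "InstanceType": "m5.xlarge", "InstanceCount": 1},
        {"Name": "Core", "InstanceRole": "CORE",
         "InstanceType": "m5.xlarge", "InstanceCount": 2},
    ]
    busy = scale_task_nodes(cluster, 4)   # add compute for a heavy workload
    quiet = scale_task_nodes(busy, 0)     # drop task nodes when it is over
    print(len(busy), len(quiet))
```

Scaling down to zero task nodes returns the cluster to its master and core groups, so the stored data is unaffected in this model.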
