Concurrency models are missing #49

Open
Vlad-Radz opened this issue Feb 19, 2021 · 2 comments

Vlad-Radz commented Feb 19, 2021

For a modern data engineer, knowledge of concurrency models is important.

  1. A data engineer should know the difference between concurrency and parallelism.
  2. A data engineer should know the difference between task parallelism and data parallelism.
  3. Threads vs. processes. Example in Python: the threading and multiprocessing libraries, how they differ, and what problems Python has with threading.
  4. A pretty typical scenario for modern data integration: call n APIs every x seconds / minutes / hours. How do you do that with good performance? One way would be to use asynchronous programming.
  5. Actor model might be good to know as well.
  6. DAGs (example: Apache Airflow) vs. state machines (example: AWS Step Functions) vs. ... This is actually covered by 'Data structures and algorithms', but it might be good to mention it as an example of how that knowledge can help a data engineer.
  7. Parallel programming on GPUs using platforms like CUDA.
  8. Functional programming is also 'nice to have' (but not obligatory).
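
As a rough illustration of point 3, here is a minimal sketch using the standard library's concurrent.futures, which puts thread pools and process pools behind the same interface. The names `count_primes`, `run_in_threads`, and `run_in_processes` are made up for this example. In standard CPython builds, the global interpreter lock (GIL) lets only one thread execute Python bytecode at a time, so the thread version usually gives no speedup for CPU-bound work like this, while the process version can use several cores.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def count_primes(limit):
    """CPU-bound toy task: count the primes below `limit` by trial division."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def run_in_threads(limits, workers=4):
    # Threads share one interpreter and its memory.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(count_primes, limits))


def run_in_processes(limits, workers=4):
    # Each worker process has its own interpreter (and its own GIL);
    # arguments and results are pickled and sent between processes.
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(count_primes, limits))


if __name__ == "__main__":
    limits = [20_000] * 4
    print(run_in_threads(limits))
    print(run_in_processes(limits))
```

Both functions return the same results; the difference is in how the work is scheduled. For I/O-bound tasks (waiting on a network or disk), threads tend to be the lighter choice because the GIL is released while a thread waits.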

If you agree on at least some of the points, I can prepare the text.
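
For point 4, a minimal sketch with asyncio from the standard library. The network call is simulated with `asyncio.sleep` so the example runs offline; `fetch`, `poll_once`, `poll_repeatedly`, and the API names are hypothetical, and a real version would swap in an async HTTP client.

```python
import asyncio


async def fetch(name, delay):
    """Stand-in for one API call; `asyncio.sleep` simulates network latency."""
    await asyncio.sleep(delay)
    return f"{name}: ok"


async def poll_once(apis):
    # Start all calls at once; while one call waits on I/O, the others proceed.
    # gather() returns results in the same order as its arguments.
    return await asyncio.gather(*(fetch(name, delay) for name, delay in apis))


async def poll_repeatedly(apis, interval, rounds):
    # Run the whole batch `rounds` times, waiting `interval` seconds
    # between the end of one batch and the start of the next.
    results = []
    for i in range(rounds):
        results.append(await poll_once(apis))
        if i < rounds - 1:
            await asyncio.sleep(interval)
    return results


if __name__ == "__main__":
    apis = [("orders", 0.2), ("users", 0.1), ("prices", 0.3)]
    for batch in asyncio.run(poll_repeatedly(apis, interval=1.0, rounds=2)):
        print(batch)
```

Because the calls in a batch overlap, one batch takes roughly as long as its slowest call rather than the sum of all calls, and everything runs on a single thread.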

@alexandraabbas
Contributor

Hey, these are really good points! I'll def consider adding these to the image when I update it next time. Feel free to create a PR and add it to the markdown version. Thanks a lot for the contribution!

@Vlad-Radz
Author

Vlad-Radz commented Apr 17, 2021

Hey, thanks for the feedback! I will create the markdown version, sure!
