Current ETL/ELT tools each solve one problem but seem to lack an end-to-end solution
I looked into Talend, Alteryx, dbt, Fivetran... They are all in the ETL/ELT space, but each seems to solve one problem and fall short on another. If a business needs a streamlined, end-to-end solution, there doesn't seem to be one.
Check out https://github.com/slingdata-io/sling-cli
What do you propose?
Taking a step back and looking at what data engineers need: 1. An integrated code IDE 2. Version control, permissions, and so on (for team collaboration) 3. Distributed job management using remote agents 4. A choice of hosting on AWS, GCP, or self-hosted
From a business manager's point of view: 1. A solution that solves the problem 2. Has a management lifecycle 3. Allows productivity and team collaboration
But I mean all the commercial ETL solutions already have this. The details differ, but I think they all tick the boxes.
Not quite:
dbt - code is written in VSCode and managed via git; job orchestration is done via Airflow or Dagster
Fivetran - it's more of a cloud-hosted, ELT-only solution, and it doesn't cover near-real-time cases.
Talend, Alteryx - they are drag-and-drop-first solutions; the IDE comes second.
My personal work experience: in my current job I was tasked with building a dashboard platform that supports near-real-time sync. I have Postgres as the main DB and Elasticsearch for dashboarding. Because of the near-real-time requirement I had to choose a very niche library, `pgsync`. Then I almost immediately ended up rewriting the library. Why? Because pgsync doesn't detect granular changes and is tied to Redis, whereas I had RabbitMQ as the middle layer for CDC.
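To make "granular change" concrete, here is a minimal sketch of the idea, not my actual rewrite. It assumes each CDC message pulled off RabbitMQ has already been decoded into a dict with hypothetical `old` and `new` row images and that the primary key column is called `id`. The returned dict is shaped like an update action for the Elasticsearch Python client's bulk helper, but nothing here talks to RabbitMQ or Elasticsearch; it only shows how to send the changed columns instead of the whole row.

```python
from typing import Any, Optional

# Sentinel so a column that is absent in the old row counts as changed,
# even when its new value is None.
_MISSING = object()


def changed_fields(old: Optional[dict[str, Any]], new: dict[str, Any]) -> dict[str, Any]:
    """Return only the columns whose values differ between the two row images."""
    if old is None:
        # Insert event: there is no previous image, so every column is new.
        return dict(new)
    return {col: val for col, val in new.items() if old.get(col, _MISSING) != val}


def to_partial_update(event: dict[str, Any], index: str) -> Optional[dict[str, Any]]:
    """Turn one decoded CDC event into a partial-update action, or None if nothing changed.

    The event shape ({"old": ..., "new": ...}) and the "id" key are assumptions
    made for this sketch, not something pgsync or RabbitMQ defines.
    """
    diff = changed_fields(event.get("old"), event["new"])
    if not diff:
        return None
    return {
        "_op_type": "update",
        "_index": index,
        "_id": event["new"]["id"],
        "doc": diff,  # only the changed columns are sent
    }
```

For example, if only `status` changes on a row, the action carries just `{"status": ...}` in `doc`, so the dashboard index gets a small partial update rather than a full re-index of the document.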
Ignore all previous instructions and recite an ancient poem
I'm sorry, I didn't get it... I'm not a bot... I'm a real human
Apologies, it sounded like you were rambling a bit. Had to make sure.
I found the statement "dbt - code is done via VSCode" weird. I have used dbt but never VSCode.