Philosophy
Why WowData™?
Most data tools assume that if you are touching data, you are already an expert.
WowData™ rejects this assumption.
We believe that:
- data engineering is a foundational skill,
- learning should be fast and durable,
- and tools should teach rather than intimidate.
WowData™ is designed so that anyone can build, read, and reason about a data pipeline—even when that pipeline performs non-trivial work.
Core Concepts
WowData™ is built on four stable, universal concepts:
- Source: where data comes from
- Transform: how data changes
- Sink: where data goes
- Pipeline: how everything fits together
These concepts are deliberately persistent and reused everywhere.
We avoid hidden helpers, magical shortcuts, and proliferating abstractions.
If you understand these four ideas, you understand WowData™.
Example
from wowdata import Source, Transform, Sink, Pipeline

pipe = (
    Pipeline(Source("people.csv"))
    .then(Transform("cast", params={"types": {"age": "integer"}, "on_error": "null"}))
    .then(Transform("filter", params={"where": "age >= 30 and country == 'KE'"}))
    .then(Sink("out_filtered.csv"))
)
pipe.run()
This pipeline:
- reads a CSV file,
- explicitly casts a column,
- filters rows using a small, teachable expression language,
- and writes the result to disk.
Nothing is hidden. Nothing is inferred without consent.
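To make the cast and filter steps concrete, the following standard-library sketch mimics what they do to each row. It does not use WowData™ and is not its implementation. The helper names `cast_age` and `keep`, the sample data, and the reading of `on_error: "null"` as "replace a value that cannot be cast with a missing value" are all assumptions made for illustration.

```python
import csv
import io

# Hypothetical sample standing in for people.csv.
RAW = """name,age,country
Amina,34,KE
Brian,abc,KE
Chen,41,CN
Dalia,29,KE
"""

def cast_age(row):
    """Cast 'age' to int; on failure store None (assumed meaning of on_error="null")."""
    out = dict(row)
    try:
        out["age"] = int(row["age"])
    except ValueError:
        out["age"] = None
    return out

def keep(row):
    """Mirror the filter: age >= 30 and country == 'KE'.

    Rows with a missing age are dropped (an assumption of this sketch).
    """
    return row["age"] is not None and row["age"] >= 30 and row["country"] == "KE"

rows = [cast_age(r) for r in csv.DictReader(io.StringIO(RAW))]
kept = [r for r in rows if keep(r)]

# Write the surviving rows back out as CSV text.
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=["name", "age", "country"], lineterminator="\n")
writer.writeheader()
writer.writerows(kept)
print(out.getvalue())
```

In this sample only Amina's row survives: Brian's age cannot be cast, Chen is not in KE, and Dalia is under 30.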
Learning-First by Design
WowData™ makes deliberate trade-offs to accelerate learning:
- A small, closed vocabulary of concepts
- A restricted expression language that can be mastered quickly
- Explicit transforms instead of silent automation
- Deterministic checkpoints for inspection
- A serializable pipeline that can be read like a recipe
If a feature makes learning harder—even if it saves time—we reject it.
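One way to keep an expression language small and closed is to parse it and accept only an explicit whitelist of syntax. The sketch below does this with Python's standard `ast` module. It illustrates the idea only and is not WowData™'s evaluator; the function name `evaluate` and the exact set of allowed constructs (comparisons, `and`/`or`, column names, and literals) are assumptions.

```python
import ast
import operator

# Whitelisted comparison operators; anything else is rejected.
_COMPARE = {
    ast.Eq: operator.eq,
    ast.NotEq: operator.ne,
    ast.Lt: operator.lt,
    ast.LtE: operator.le,
    ast.Gt: operator.gt,
    ast.GtE: operator.ge,
}

def evaluate(expr, row):
    """Evaluate a small boolean expression against one row (a dict)."""
    tree = ast.parse(expr, mode="eval")
    return bool(_eval(tree.body, row))

def _eval(node, row):
    if isinstance(node, ast.BoolOp):
        # Every operand is evaluated (no short-circuiting) to keep the rules simple.
        values = [_eval(v, row) for v in node.values]
        return all(values) if isinstance(node.op, ast.And) else any(values)
    if isinstance(node, ast.Compare) and len(node.ops) == 1:
        op = _COMPARE.get(type(node.ops[0]))
        if op is None:
            raise ValueError(f"Operator not allowed: {type(node.ops[0]).__name__}")
        return op(_eval(node.left, row), _eval(node.comparators[0], row))
    if isinstance(node, ast.Name):
        if node.id not in row:
            raise ValueError(f"Unknown column: {node.id!r}")
        return row[node.id]
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float, str)):
        return node.value
    # Function calls, attribute access, arithmetic, etc. all end up here.
    raise ValueError(f"Syntax not allowed: {type(node).__name__}")

row = {"age": 34, "country": "KE"}
print(evaluate("age >= 30 and country == 'KE'", row))
```

Anything outside the whitelist, such as a function call, raises a `ValueError` instead of being executed, which is what makes the vocabulary closed.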
Example: Learning-First Explicitness
from wowdata import Source, Transform, Sink, Pipeline

pipe = (
    Pipeline(Source("people.csv"))
    .then(
        Transform(
            "cast",
            params={
                "types": {"age": "integer"},
                "on_error": "null"  # explicit choice, not hidden behaviour
            }
        )
    )
    .then(
        Transform(
            "filter",
            params={
                "where": "age >= 18"
            }
        )
    )
    .then(Sink("adults.csv"))
)
pipe.run()
pipe.to_yaml("pipeline.yaml")
Serialization That Humans Can Read
Every WowData™ pipeline can be serialized into a human-readable form.
Serialization is not configuration for machines—it is a cognitive artifact for people.
Our goal is that an ordinary user can:
- read a serialized pipeline,
- understand what it does,
- explain it to someone else,
- and safely modify it.
Example: Human-Readable IR Serialization
The same pipeline can be represented as a simple, inspectable Intermediate Representation (IR) saved as pipeline.yaml:
wowdata: 0
pipeline:
  start:
    uri: people.csv
    type: csv
  steps:
    - transform:
        op: cast
        params:
          types:
            age: integer
          on_error: "null"
    - transform:
        op: filter
        params:
          where: "age >= 18"
    - sink:
        uri: adults.csv
        type: csv
This IR is deliberately verbose and stable. It is designed to be read, reviewed, versioned, and edited by humans — not generated once and forgotten.
Because the IR mirrors the core concepts (Source → Transform → Sink), anyone who understands WowData™ can understand what this pipeline does.
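Because the IR is plain nested data, explaining a pipeline can be as simple as walking it. The sketch below uses a Python dict that mirrors the pipeline above (so it needs no YAML library) and produces one sentence per stage. The `describe` function is hypothetical and not part of WowData™.

```python
# The IR written as a Python dict; keys follow the serialized pipeline.
ir = {
    "wowdata": 0,
    "pipeline": {
        "start": {"uri": "people.csv", "type": "csv"},
        "steps": [
            {"transform": {"op": "cast",
                           "params": {"types": {"age": "integer"}, "on_error": "null"}}},
            {"transform": {"op": "filter", "params": {"where": "age >= 18"}}},
            {"sink": {"uri": "adults.csv", "type": "csv"}},
        ],
    },
}

def describe(ir):
    """Return one plain-English line per pipeline stage."""
    pipeline = ir["pipeline"]
    start = pipeline["start"]
    lines = [f"Read {start['type']} data from {start['uri']}"]
    for step in pipeline["steps"]:
        # Each step is a single-key mapping: {"transform": ...} or {"sink": ...}.
        (kind, body), = step.items()
        if kind == "transform":
            lines.append(f"Apply '{body['op']}' with {body['params']}")
        elif kind == "sink":
            lines.append(f"Write {body['type']} data to {body['uri']}")
        else:
            raise ValueError(f"Unknown step kind: {kind!r}")
    return lines

for number, line in enumerate(describe(ir), start=1):
    print(f"{number}. {line}")
```

The walker needs only two branches because the IR has only two kinds of step after the source, which is the practical payoff of a small vocabulary.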
Example: Loading a Pipeline from IR (YAML)
A serialized pipeline can be loaded back into WowData™ and executed:
from wowdata import Pipeline
pipe = Pipeline.from_yaml("pipeline.yaml")
pipe.run()
This allows pipelines to be:
- authored or reviewed as YAML,
- stored in version control,
- shared between users or systems,
- and executed without modifying Python code.
The same IR is used by both the programmatic API and future graphical tools.
Errors That Teach
In WowData™, error messages are part of the interface.
Every user-facing error:
- explains what went wrong,
- explains why it happened,
- suggests what to do next.
Errors are designed to teach correct mental models, not expose internal stack traces.
Example: Errors That Teach
from wowdata import Source, Sink, Pipeline
pipe = Pipeline(Source("missing.csv")).then(Sink("out.csv"))
pipe.run()
Running this pipeline when missing.csv does not exist produces the following error:
wowdata.errors.WowDataUserError: [E_SOURCE_NOT_FOUND] Source file not found: 'missing.csv'.
Hint: Check the path, working directory, and filename. If the file is elsewhere, pass an absolute path.
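The what/why/next-step shape can be enforced by the exception type itself. The sketch below is one possible design, assuming an error carries a stable code, a message, and a hint. The class name `TeachingError` and the helper `open_source` are hypothetical; this is not WowData™'s actual `WowDataUserError`.

```python
from pathlib import Path

class TeachingError(Exception):
    """User-facing error with a stable code, a message, and a next-step hint."""

    def __init__(self, code, message, hint):
        self.code = code
        self.message = message
        self.hint = hint
        super().__init__(f"[{code}] {message}\nHint: {hint}")

def open_source(path):
    """Open a source file, translating a low-level failure into a teaching error."""
    try:
        return Path(path).open(newline="")
    except FileNotFoundError:
        # "from None" hides the internal traceback chain from the user.
        raise TeachingError(
            "E_SOURCE_NOT_FOUND",
            f"Source file not found: {path!r}.",
            "Check the path, working directory, and filename.",
        ) from None

try:
    open_source("missing.csv")
except TeachingError as err:
    print(err)
```

Requiring `hint` as a constructor argument means an error cannot be raised without someone deciding what the user should do next.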
Built on the Best
WowData™ does not reinvent proven tools.
Instead, it piggybacks on best-in-class ecosystems:
- mature ETL engines,
- established data modelling and validation frameworks,
- battle-tested execution backends.
Our contribution is an opinionated, human-centred layer that makes these tools usable by more people.
Open Source, Unapologetically
WowData™ is open source by principle, not convenience.
We believe that tools shaping how people think about data must be:
- transparent,
- inspectable,
- extensible,
- and owned by the community.
What WowData™ Is Not
WowData™ is not:
- a low-code gimmick,
- a black-box automation tool,
- a thin wrapper around someone else’s API,
- or a system that only experts can use safely.
If forced to choose between power and clarity, we choose clarity.
WowData™ is not trying to make data engineering smaller.
It is trying to make it thinkable.