Data Science Case Studies

Forecasting, evaluation, applied ML, and decision-support tooling built for messy real-world constraints.

This section covers the work where modeling and product meet: forecast quality systems, localization logic, tokenomics simulations, experimentation analysis, and interactive tools that help people think more clearly about probability and operational risk.

Forecasting · Model evaluation · Localization systems · Interactive tools

World of Warships Ship Drop Simulator - A Fan-Made Probability Tool

An unofficial decision-support tool that turns drop-rate assumptions and pity rules into clear probability curves, expected values, and collection odds.

World of Warships ship drop simulator preview

Localization Is Not Translation - Context Cartridges for International Football

A localization pattern that keeps generated sports content culturally native without letting the model invent the wrong teams, references, or tone.

Locale controls for context cartridges

Beyond WAPE - Building a Forecast Quality Framework

A pragmatic evaluation system for deciding whether a model is safe to ship, not just numerically impressive.

Forecast quality framework flowchart

Post-Launch Lessons: Debugging Smart-Contract Rewards and the Year-1 Data Roadmap

What the first launch cycle exposed about reward bugs, reporting gaps, and the data roadmap needed to keep the system legible.

Year one data roadmap density chart

Slash and Secure - Designing Oracle-Node Penalties

A penalty-design exercise balancing operator incentives, fault tolerance, and the need to keep bad behavior economically unattractive.

Oracle slashing visualization

Investor Unlock Schedules in Python - A Nine-Scenario Simulation

A scenario simulator built to show how unlock design changes sell pressure, allocation dynamics, and governance conversations.

Investor unlock sell pressure chart

Unraveling the Impact of Chart Positioning on App Sales

A mixed econometrics and feature-importance study on how app rankings, impressions, and sales actually interact.

Chart positioning analysis preview

Advanced Forecasting with Prophet

A practical forecasting workflow for turning messy game-sales data into explainable predictions without building a bespoke model stack from scratch.

Prophet forecasting preview

Localization Is Not Translation - Context Cartridges for International Football

1. Problem

Translation changes words. Localization changes meaning. Sports models trained on generic internet text will happily invent the wrong club references, nicknames, or tournament tone unless the context is constrained explicitly.

2. Approach

I packaged locale knowledge into small, auditable context cartridges that define what the model is allowed to say and what it must avoid.
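The cartridge idea can be sketched as a small data structure plus an audit step. This is an illustrative shape, not the production implementation; the locale, term lists, and the "soccer vs Fußball" rule are hypothetical examples.

```python
from dataclasses import dataclass


@dataclass
class ContextCartridge:
    """A small, auditable bundle of locale constraints (illustrative shape)."""
    locale: str
    allowed_terms: set  # vocabulary the model may use for this locale
    banned_terms: set   # vocabulary that signals drift for this locale
    tone_notes: str = ""

    def audit(self, text: str) -> list:
        """Return QA violations explicitly instead of trusting the prompt."""
        lowered = text.lower()
        return [t for t in self.banned_terms if t.lower() in lowered]


# Hypothetical German-football cartridge: generated copy should not
# fall back to US-English framing.
cartridge = ContextCartridge(
    locale="de-DE",
    allowed_terms={"Bundesliga", "DFB-Pokal"},
    banned_terms={"soccer"},
)
violations = cartridge.audit("A great soccer night in the Bundesliga")
```

Because the constraints live in data rather than prose, the audit output can sit next to the generated text in QA, which is what makes drift visible before it reaches editorial users.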

3. Evidence

Locale story panel
The cartridge structure made localization auditable instead of relying on faith in the prompt.
Constraints panel
QA signals next to the output made drift visible before it reached product or editorial users.

4. Outcome

The system made localization safer, faster to review, and much easier to explain to non-ML stakeholders because the constraints were explicit rather than buried in prompt text.

5. Tech stack

6. Useful links

7. Related reading

8. Call to action

If you need LLM output to stay native-sounding without drifting away from the facts, I can help design the constraint layer and QA workflow.


Beyond WAPE - Building a Forecast Quality Framework

1. Problem

A forecast can look acceptable on one headline metric and still be operationally dangerous. Teams need a way to decide whether a model is safe to ship, not just statistically convenient to present.

2. Approach

I defined forecast quality as a small set of auditable axes tied to failure modes: accuracy, bias, stability, tail robustness, and peak behavior.
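A few of those axes can be made concrete with small, auditable metric functions. This is a minimal sketch; the function names, the top-k definition of "peak behavior," and the test values are illustrative, not the framework's actual code.

```python
def wape(actual, forecast):
    """Weighted absolute percentage error: total |error| over total actuals."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / sum(actual)


def bias(actual, forecast):
    """Signed error share: positive means systematic over-forecasting."""
    return sum(f - a for a, f in zip(actual, forecast)) / sum(actual)


def peak_miss(actual, forecast, top_k=3):
    """WAPE restricted to the top-k actual periods (the peak-behavior axis)."""
    peaks = sorted(range(len(actual)), key=lambda i: actual[i], reverse=True)[:top_k]
    return (sum(abs(actual[i] - forecast[i]) for i in peaks)
            / sum(actual[i] for i in peaks))
```

The point of splitting the axes this way is that a model can post a flattering WAPE while failing `bias` or `peak_miss`, which is exactly the "numerically impressive but unsafe to ship" case.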

3. Evidence

Forecast quality rollout flowchart
The staged rollout protected the team from turning evaluation into a second product before the core questions were answered.

4. Outcome

The framework made it much easier to compare models honestly and explain tradeoffs to commercial stakeholders who did not care about a metric zoo.

5. Tech stack

6. Useful links

7. Related reading

8. Call to action

If your team is still shipping forecasts off a single error metric, I can help design an evaluation surface that reflects operational reality.


Post-Launch Lessons: Debugging Smart-Contract Rewards and the Year-1 Data Roadmap

1. Problem

Launch changes the nature of the work. The main job becomes understanding failure modes quickly enough that stakeholders trust the system while the product is still learning how it behaves in the wild.

2. Approach

I treated the post-launch period as a diagnostic and roadmap exercise: identify the reward bugs, classify the missing observability, and sequence the first year of data work around the most important blind spots.

3. Evidence

Bug-fix burndown chart
The useful launch metric was not just bug count. It was how quickly the team could translate observed failures into durable instrumentation.
Roadmap density chart
The roadmap forced the team to prioritize observability and trust-building work instead of reacting ad hoc forever.

4. Outcome

The work produced a clearer year-one data roadmap and a more concrete understanding of which reward and reporting issues were transient versus structural.

5. Tech stack

6. Useful links

7. Related reading

8. Call to action

If your launch data is telling you something important but the team cannot yet explain it clearly, I can help structure the diagnostics and the year-one roadmap.


Slash and Secure - Designing Oracle-Node Penalties

1. Problem

Penalty systems have to do two jobs at once: deter bad behavior and avoid creating a regime where honest operators are punished into leaving. The modeling challenge is balancing credibility and survivability.

2. Approach

The design work treated slashing as an incentive system rather than a moral statement. I modeled how different penalty schedules interacted with operator behavior and the broader reward structure.
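The incentive framing reduces to comparing expected values. The following is a deliberately simplified sketch under assumed parameters (single round, fixed detection probability, proportional slash); the real schedules were richer, but the deterrence condition it encodes is the core of the exercise.

```python
def operator_ev(reward, stake, slash_frac, cheat_gain, detect_prob):
    """Expected value per round for an honest vs a cheating operator.

    Deterrence holds when the cheating EV drops below the honest EV,
    i.e. when detect_prob * slash_frac * stake > cheat_gain.
    """
    honest = reward
    cheating = reward + cheat_gain - detect_prob * slash_frac * stake
    return honest, cheating


# Hypothetical parameters: a 10% slash at 20% detection on a 100-unit
# stake makes a 0.5-unit cheat strictly unprofitable.
honest_ev, cheat_ev = operator_ev(
    reward=1.0, stake=100.0, slash_frac=0.1, cheat_gain=0.5, detect_prob=0.2,
)
```

Sweeping `slash_frac` against `detect_prob` in a model like this is what turns "how harsh should slashing be" from a moral argument into a reviewable parameter choice.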

3. Evidence

Oracle node slashing visualization
The value of the exercise was clarifying where the penalty curve deterred bad behavior without making the system economically irrational for everyone else.

4. Outcome

The work helped turn abstract security discussions into a parameterized penalty design that engineering and governance could review together.

5. Tech stack

6. Useful links

7. Related reading

8. Call to action

If you need to design penalties, safeguards, or incentive parameters without relying on gut feel, I can help build the model that frames the tradeoffs clearly.


Investor Unlock Schedules in Python - A Nine-Scenario Simulation

1. Problem

Unlock schedules are often discussed as if the shape is obvious, but small changes in timing, tranche size, and allocation mix can change sell pressure dramatically. The team needed a scenario surface, not a static vesting table.

2. Approach

I built a compact simulator that crossed multiple unlock patterns and supply assumptions so the downstream effects on price pressure and allocation behavior could be compared quickly.
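The core of such a simulator is small. The sketch below covers one scenario shape only (cliff plus linear vesting, with a fixed sell fraction as the pressure proxy); the actual nine-scenario grid crossed more patterns, and both the proxy and the parameters here are assumptions for illustration.

```python
def unlock_schedule(total, cliff_months, vest_months):
    """Tokens unlocked per month: nothing until the cliff, then linear vesting."""
    per_month = total / vest_months
    return [
        0.0 if m <= cliff_months else per_month
        for m in range(1, cliff_months + vest_months + 1)
    ]


def sell_pressure(schedule, sell_frac):
    """Naive pressure proxy: a fixed fraction of each unlock hits the market."""
    return [unlocked * sell_frac for unlocked in schedule]


# Hypothetical tranche: 1,200 tokens, 6-month cliff, 12-month linear vest.
schedule = unlock_schedule(total=1200.0, cliff_months=6, vest_months=12)
pressure = sell_pressure(schedule, sell_frac=0.3)
```

Crossing several `unlock_schedule` variants against supply assumptions is what produces the scenario surface: the sell-pressure curves can then be compared side by side instead of arguing from a static vesting table.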

3. Evidence

Investor unlock sell pressure plot
The sell-pressure view turned a governance conversation about fairness into a discussion about actual system behavior.
Allocation pie charts
Allocation visuals helped show how the schedule changed who was bearing the economic weight over time.

4. Outcome

The simulator gave the team a fast way to compare alternatives and made vesting design more defensible because the scenarios could be challenged explicitly.

5. Tech stack

6. Useful links

7. Related reading

8. Call to action

If your token, incentive, or vesting design still lives in a static spreadsheet, I can help turn it into a scenario model people can pressure-test.


Unraveling the Impact of Chart Positioning on App Sales

1. Problem

Teams often treat chart ranking as a proxy for sales performance. The harder question is how ranking, impressions, and sales interact over time, and whether chart position is a driver, a consequence, or both.

2. Approach

I combined time-series analysis with feature-importance views to examine the relationship from more than one angle. That avoided the trap of declaring one clean causal story from a messy market system.
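One simple entry point to the "driver, consequence, or both" question is lagged correlation in both directions. This is a minimal stdlib sketch, not the study's actual method, and lagged correlation is descriptive, not causal; it only hints at which series tends to lead.

```python
from statistics import mean, pstdev


def lagged_correlation(x, y, lag):
    """Pearson correlation between x[t] and y[t + lag], for lag >= 0.

    Comparing corr(rank_t, sales_{t+k}) against corr(sales_t, rank_{t+k})
    gives a first, non-causal look at which series tends to lead.
    """
    xs = x[: len(x) - lag] if lag else x
    ys = y[lag:]
    mx, my = mean(xs), mean(ys)
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys)) / len(xs)
    return cov / (pstdev(xs) * pstdev(ys))
```

An impulse-response analysis (as used in the study) extends this idea to a full dynamic system, tracing how a shock to ranking propagates through sales over multiple periods rather than at a single lag.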

3. Evidence

Impulse response analysis for chart positioning
The impulse-response view helped show that the sales and ranking relationship was dynamic rather than a simple ladder effect.
Feature importance for Great Britain sales
Feature importance clarified which variables really drove the model rather than which story felt intuitively right.

4. Outcome

The study challenged the simplistic narrative that better ranking automatically means better sales and gave a more nuanced view of visibility, momentum, and market response.

5. Tech stack

6. Useful links

7. Related reading

8. Call to action

If your growth analysis still depends on one proxy metric standing in for the whole market system, I can help build the analysis that disentangles it.


Advanced Forecasting with Prophet

1. Problem

Not every forecasting problem needs a custom model stack. Sometimes the real challenge is building a reliable workflow around a strong baseline and being honest about what it can and cannot capture.

2. Approach

This workflow used Prophet as a practical forecasting baseline for messy game-sales data, with enough preprocessing and segmentation to make the outputs useful rather than decorative.
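The discipline matters more than the model choice. As a minimal, dependency-free sketch of that discipline (not the project's Prophet code), any candidate forecast can be held against a seasonal-naive floor before it earns trust; the weekly season and test data here are assumptions.

```python
def seasonal_naive(history, horizon, season=7):
    """Repeat the last observed season forward as a floor baseline."""
    last_season = history[-season:]
    return [last_season[h % season] for h in range(horizon)]


def wape(actual, forecast):
    """Weighted absolute percentage error: total |error| over total actuals."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / sum(actual)


# A richer model (Prophet or anything else) only earns its complexity
# if it beats this floor on held-out data:
# candidate_wape < wape(actual, seasonal_naive(history, len(actual)))
```

This framing is what makes the "good enough baseline vs richer features" conversation in the evidence section concrete: the baseline sets a number, and promotions or event effects show up as the gap the baseline cannot close.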

3. Evidence

Individual Prophet forecast
The forecast looked strong where the structure was stable and much weaker where promotions or event effects were missing from the model.
Aggregated Prophet forecast versus actuals
The aggregated view made it easier to discuss when the baseline was good enough and when the business needed richer features.

4. Outcome

The project showed how a relatively lightweight forecasting stack can still provide decision value when the workflow is disciplined and the limitations are explicit.

5. Tech stack

6. Useful links

7. Related reading

8. Call to action

If you need a forecasting baseline that is quick to ship and easy to explain before investing in something heavier, I can help set up the workflow and evaluation pattern.


World of Warships Ship Drop Simulator - A Fan-Made Probability Tool

Note: this is an unofficial fan-made tool. It is designed for reasoning about probability scenarios, not for asserting official event odds or guarantees.

1. Problem

Loot-box-style systems are hard to think about because intuition is terrible at compounding probabilities. People remember anecdotes, not distributions. I wanted a tool that made the odds, pity logic, and expected values visible enough to support an informed decision.

2. Approach

The simulator models a base drop rate, a configurable number of pulls, and an optional pity rule, then turns those assumptions into cumulative probability curves and summary statistics.
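The math behind the curves can be sketched in a few lines. This is a simplified model of the simulator's logic under one assumed pity rule (a guaranteed drop on pull `pity_at` if nothing has dropped yet); the actual tool exposes these as configurable inputs.

```python
from typing import Optional


def drop_curve(base_rate: float, max_pulls: int,
               pity_at: Optional[int] = None) -> list:
    """Cumulative probability of at least one drop after n pulls."""
    probs = []
    p_no_drop = 1.0
    for n in range(1, max_pulls + 1):
        if pity_at is not None and n >= pity_at:
            p_no_drop = 0.0  # pity guarantees the drop by this pull
        else:
            p_no_drop *= (1.0 - base_rate)
        probs.append(1.0 - p_no_drop)
    return probs


def expected_pulls(base_rate: float, pity_at: Optional[int] = None) -> float:
    """Expected pulls until the first drop (truncated geometric under pity)."""
    if pity_at is None:
        return 1.0 / base_rate
    return sum((1.0 - base_rate) ** n for n in range(pity_at))
```

Plotting `drop_curve` with and without `pity_at` is what produces the shape change shown in the evidence screenshots: pity clamps the tail of the failure probability to zero instead of letting it decay geometrically forever.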

3. Evidence

Ship drop simulator main view
The main view turns a vague discussion about luck into a visible curve with concrete summary numbers.
Ship drop simulator with pity enabled
Pity settings change the shape of the curve in ways that are much easier to understand once plotted explicitly.

4. Outcome

The tool reframes a monetization mechanic as a probability problem. That is useful both for players thinking clearly about expected outcomes and for anyone designing consumer-facing systems that rely on repeated draws.

5. Tech stack

6. Useful links

7. Related reading

8. Call to action

Open the interactive simulator if you want to inspect the curves yourself, or get in touch if you need a similar decision-support tool for another probability-heavy system.