Data Scientist Goals

Data Scientist Goals Examples: 64 Goal-Setting Actions for Data Scientists and Analysts

Turn raw data into decisions that drive measurable business outcomes while maintaining scientific rigor and communicating findings with clarity at every level of the organization.

8 pillars × 8 actions = 64 specific steps, adapted from the Harada Method used by Shohei Ohtani at age 16.

Document every analytical assumption
Report confidence intervals always
Publish a reproducibility checklist
Run a monthly internal workshop
Mentor a junior analyst weekly
Open-source a reusable utility
Master one new model class
Build a baseline before anything else
Validate on held-out time windows
Acknowledge data quality issues upfront
Refuse to p-hack results
Write one tutorial per quarter
Review pull requests from others
Run calibration checks on every classifier
Document feature engineering decisions
Correct past errors publicly
Separate exploration from confirmation
Maintain a personal methods log
Share raw notebooks, not just slides
Answer questions in public channels
Nominate peers for recognition
Conduct residual analysis on every regression
Compare models with proper statistical tests
Ship one interpretable model monthly
Add data quality checks to every pipeline
Profile every new dataset on arrival
Version control all training datasets
Start every project with the decision
Quantify the cost of a wrong prediction
Present findings without jargon
Monitor production data distributions
Write idempotent transformation functions
Connect every metric to revenue or cost
Shadow a non-data team for one day quarterly
Reduce pipeline runtime by 20 percent
Create a data dictionary for every table
Handle missing data with explicit strategy
Build a recommendation, not just a report
Track the downstream impact of past work
Map the analytics value chain
Read one peer-reviewed paper per week
Implement one paper from scratch quarterly
Complete one structured course per quarter
Lead with the insight, not the method
Build one chart per key finding
Use color with intention
Track every experiment in MLflow or equivalent
Design A/B tests with adequate power
Automate hyperparameter search
Build a personal project portfolio
Attend one industry conference per year
Annotate charts with the conclusion
Write a one-paragraph executive brief
Implement feature importance analysis
Write a model card for every shipped model
Learn SQL at an advanced level
Study causal inference methods
Review your own code after 30 days
Present to a non-technical audience monthly
Collect feedback on every deliverable
Build interactive dashboards for recurring questions
Run one offline rollback test per quarter
Evaluate model fairness across subgroups
Kill underperforming models on schedule

Character Pillar

  • Before running any analysis, write down three assumptions you are making about the data and what would break the conclusion if any were wrong. You become the analyst who surfaces uncertainty before stakeholders ask, building a reputation for intellectual honesty over convenient answers.
  • Add confidence intervals or credible intervals to every point estimate you present this week, even in internal Slack posts. You become someone who never lets a single number masquerade as certainty, training your organization to think probabilistically.
  • Write a one-page checklist that any teammate can follow to reproduce your last completed analysis from raw data to final output. You become the standard-setter for reproducible research on your team, making every result auditable and trustworthy.
  • Add a data limitations section to your next report before the results section, listing at least two ways the data could bias the conclusions. You become the scientist who earns long-term credibility by disclosing problems voluntarily, not hiding them until they surface in production.
  • Pre-register your hypothesis and success metric in writing before running any A/B test or experiment, then share it with your manager. You become a practitioner of honest science inside a commercial environment, protecting the organization from decisions built on false positives.
  • When you find a mistake in a prior analysis, write a brief correction note and send it to the original audience, not just fix it quietly. You become someone whose corrections are seen as a sign of strength, not weakness, creating a culture where errors get surfaced fast.
  • Label every analysis as either exploratory or confirmatory in the first line of the notebook, and never use exploratory findings as confirmed proof. You become rigorous enough to distinguish pattern-hunting from hypothesis testing, preventing your team from acting on spurious correlations.
  • Keep a running document where you log every analytical decision made during a project and why you made it, updated daily during active work. You become the analyst whose reasoning process is as valuable as the output, allowing others to learn from your decision trail.
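The confidence-interval habit above does not require heavy tooling. Here is a minimal sketch of a percentile bootstrap using only the standard library; the `bootstrap_ci` name and the `conversion_lift` sample data are invented for illustration, not taken from any particular project:

```python
import random
import statistics

def bootstrap_ci(data, stat=statistics.mean, n_boot=2000, alpha=0.05, seed=42):
    """Percentile bootstrap confidence interval for an arbitrary statistic."""
    rng = random.Random(seed)  # fixed seed keeps the interval reproducible
    n = len(data)
    boot_stats = sorted(
        stat([rng.choice(data) for _ in range(n)]) for _ in range(n_boot)
    )
    lo = boot_stats[int((alpha / 2) * n_boot)]
    hi = boot_stats[int((1 - alpha / 2) * n_boot)]
    return stat(data), (lo, hi)

# Toy example: report the interval alongside the point estimate.
conversion_lift = [0.8, 1.2, 0.9, 1.5, 1.1, 0.7, 1.3, 1.0, 0.95, 1.25]
point, (lo, hi) = bootstrap_ci(conversion_lift)
print(f"mean lift: {point:.2f}, 95% CI: [{lo:.2f}, {hi:.2f}]")
```

Pasting the interval, not just the point estimate, into a Slack message is the whole habit; the helper makes it a one-liner.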

Karma Pillar

  • Schedule a 45-minute working session once a month to teach one statistical concept or tool to non-data colleagues using a real business dataset. You become the person who raises data literacy across the whole organization, not just the team that already understands statistics.
  • Block 30 minutes every week to pair with a junior analyst on their current work, asking questions rather than giving answers. You become a force multiplier whose influence on the organization grows through the people you develop, not just the models you ship.
  • Pick one internal helper function or pipeline component from your work this quarter and publish it to GitHub with a README and usage example. You become a contributor to the broader data science community, establishing your expertise publicly while giving away something of real value.
  • Publish a technical walkthrough on a topic you mastered in the last 90 days: pick a platform (personal blog, Towards Data Science, LinkedIn) and ship it. You become someone who processes learning through teaching, deepening your own understanding while building a public record of expertise.
  • Commit to reviewing at least two pull requests per week from teammates, leaving at least one substantive comment on methodology or code quality each time. You become a quality anchor on the team, lifting everyone's output while staying connected to work beyond your own.
  • After every major project, post the full Jupyter notebook to a shared team folder so colleagues can learn from the actual code, not just the summary. You become transparent in your craft, inviting scrutiny and learning rather than protecting the mystique of your work.
  • When a colleague DMs you a data or statistics question, ask them if it is okay to move the conversation to a public Slack channel so others benefit. You become a knowledge hub whose expertise scales beyond one-on-one conversations, building shared institutional knowledge over time.
  • Once a month, send a specific, written acknowledgment to your manager about a contribution a teammate made that deserves recognition. You become someone who celebrates the team's wins as loudly as your own, building the kind of culture where great people want to work.

Pillar 3: Modeling Rigor

  • Pick one model type you have never shipped to production (gradient boosting, mixed effects, survival model) and build a working example on a real dataset this month. You become a data scientist with genuine breadth, choosing models because they fit the problem rather than because they are the only ones you know.
  • On every new modeling project, implement the simplest possible baseline (mean prediction, logistic regression, rule-based heuristic) and record its performance before touching anything more complex. You become someone who never over-engineers, always knowing whether complexity is earning its keep against a simple alternative.
  • For any time-series or sequential problem, implement walk-forward validation instead of random train-test split, and document the difference in performance metrics. You become a modeler who understands the difference between retrospective accuracy and real-world predictive performance.
  • After training any classification model, plot a reliability diagram and report the expected calibration error alongside AUC and precision-recall metrics. You become a practitioner who cares whether predicted probabilities are trustworthy, not just whether the model ranks correctly.
  • For every feature you create, write one sentence explaining the business intuition behind it and add it to a feature dictionary shared with the team. You become the modeler whose work can be maintained and improved by others, not a black box that only you can modify.
  • After fitting any regression model, produce a four-panel diagnostic plot (residuals vs. fitted, QQ plot, scale-location, leverage) and address any visible violations before presenting results. You become a statistician who treats model assumptions as real constraints, not formalities to skip under deadline pressure.
  • Use bootstrap confidence intervals or a paired test (McNemar, DeLong) when comparing two models, and stop declaring a winner based on a single holdout metric difference. You become rigorous enough to distinguish real performance gains from sampling noise, saving your organization from shipping models that do not actually improve on the prior.
  • At least once a month, intentionally choose an interpretable model (logistic regression, decision tree, scorecard) for a problem where a black box would have been easier to justify. You become a practitioner who values explainability as a feature, building trust with stakeholders who need to understand why the model said what it said.
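The expected calibration error mentioned above can be computed without any plotting or ML library. This is a rough sketch in plain Python; the function name and the default of ten equal-width bins are my own choices, not a standard API:

```python
def expected_calibration_error(y_true, y_prob, n_bins=10):
    """Weighted average gap between mean predicted probability (confidence)
    and observed positive rate (accuracy) across equal-width probability bins."""
    buckets = [[] for _ in range(n_bins)]
    for label, prob in zip(y_true, y_prob):
        idx = min(int(prob * n_bins), n_bins - 1)  # prob == 1.0 falls in the last bin
        buckets[idx].append((label, prob))
    n = len(y_true)
    ece = 0.0
    for bucket in buckets:
        if not bucket:
            continue
        accuracy = sum(label for label, _ in bucket) / len(bucket)
        confidence = sum(prob for _, prob in bucket) / len(bucket)
        ece += (len(bucket) / n) * abs(accuracy - confidence)
    return ece

# A model that predicts 0.9 but is right every time is miscalibrated by 0.1:
print(expected_calibration_error([1, 1, 1, 1], [0.9, 0.9, 0.9, 0.9]))
```

Report this number next to AUC: a model can rank perfectly and still be badly miscalibrated.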

Pillar 4: Data Quality and Engineering

  • Add at least three automated assertions to your next data pipeline: row count within expected range, no nulls in key columns, no out-of-range values in numeric fields. You become a data scientist who catches data drift before it corrupts a model in production, rather than discovering it from a stakeholder complaint.
  • Before any analysis, run a full profile (pandas-profiling, ydata-profiling, or equivalent) on every new dataset and read the full output before writing a single line of model code. You become someone who respects data as the first constraint on every analysis, not an assumed input to skip past.
  • Set up DVC or a simple hash-based snapshot system so that every model training run points to a specific, reproducible version of the training data. You become a data scientist whose experiments are fully reproducible months later, not dependent on a dataset that may have changed.
  • For every model in production, set up a weekly report comparing the distribution of the top five input features this week versus the training distribution. You become someone who treats model deployment as an ongoing responsibility, not a one-time handoff to engineering.
  • Refactor your next data transformation so it produces identical output whether run once or ten times on the same input, and add a test that verifies this. You become an engineer-grade data scientist whose pipelines can be re-run safely without manual cleanup or fear of side effects.
  • Profile the slowest step in your most-used data pipeline this week, identify the bottleneck, and implement one optimization (vectorization, query pushdown, caching) to cut runtime. You become a practitioner who respects compute as a finite resource, building pipelines that scale without proportional cost growth.
  • For every table you own or regularly query, write a one-paragraph description of what the table represents and a one-line definition for every column you use. You become the person on the team who can onboard a new analyst in one hour, because the documentation exists and is accurate.
  • For your next project, write a missing data analysis section that counts missingness by column, tests whether it is MCAR or MAR, and justifies your imputation or exclusion decision. You become a scientist who treats missing data as a research question, not an inconvenience to be resolved silently with a default fillna.
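The idempotency requirement above has a simple mechanical test: run the transform twice and assert the outputs match. A toy sketch, with an invented record schema and function name purely for illustration:

```python
def normalize_records(records):
    """Deduplicate by id and normalize emails; designed to be safe to re-run."""
    seen = {}
    for rec in records:
        cleaned = {
            "id": rec["id"],
            "email": rec["email"].strip().lower(),
        }
        seen[cleaned["id"]] = cleaned  # last write wins; re-running changes nothing
    return sorted(seen.values(), key=lambda r: r["id"])

raw = [
    {"id": 2, "email": "  Bob@Example.COM"},
    {"id": 1, "email": "alice@example.com "},
    {"id": 2, "email": "bob@example.com"},
]
once = normalize_records(raw)
twice = normalize_records(once)
assert once == twice, "transformation is not idempotent"
```

The key design choice is that every operation (strip, lowercase, dedupe, sort) is a fixed point on already-clean data, so a crashed pipeline can simply be re-run from the top.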

Pillar 5: Business Impact

  • Before writing any code, write down in one sentence the specific business decision that this analysis will inform, and confirm it with the stakeholder before proceeding. You become a data scientist who never builds a model nobody asked for, ensuring every project has a clear path from output to action.
  • For every classification or regression model, build a cost matrix that assigns a dollar or percentage cost to false positives and false negatives, and use it to tune the decision threshold. You become someone who optimizes models for business outcomes, not just for accuracy metrics that look good in a notebook.
  • Draft your next slide deck and then rewrite every sentence that uses a technical term without a plain-English translation, until a non-data colleague can read it without stopping. You become the data scientist who gets their recommendations acted on, because the business can understand them without a decoder ring.
  • In every project report, add a section that translates the model improvement (lift, AUC gain, error reduction) into an estimated dollar impact using a clear, documented formula. You become someone who speaks the language of the business, making data science visible as a revenue driver rather than a cost center.
  • Once per quarter, spend a full day embedded with a sales, marketing, operations, or product team to understand how they make decisions and where data could actually help. You become a data scientist with genuine domain knowledge, surfacing problems worth solving rather than waiting for requests from people who do not know what is possible.
  • Change the last section of every analysis deliverable from a summary of findings to a specific recommended action with a stated confidence level and a proposed success metric. You become a decision partner to the business, not a reporting service, and your work starts appearing in strategy meetings rather than quarterly reviews.
  • Six weeks after delivering an analysis, follow up with the stakeholder to find out whether the recommendation was acted on and, if so, what happened. You become accountable for impact, not just output, closing the feedback loop that most data scientists leave permanently open.
  • Draw a one-page diagram showing how your team's data products connect to the company's key performance indicators, and identify one gap where data science could add a missing link. You become a strategic thinker who understands where data science fits in the business model, positioning your team for investment rather than budget cuts.
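The cost-matrix tuning described above amounts to sweeping the decision threshold and pricing each error type. A minimal sketch with toy numbers; in practice `cost_fp` and `cost_fn` come from the business, not the model:

```python
def best_threshold(y_true, y_prob, cost_fp, cost_fn):
    """Return (total_cost, threshold) minimizing the dollar cost of errors."""
    best = (float("inf"), 0.5)
    for thr in sorted(set(y_prob)):  # each distinct score is a candidate cutoff
        fp = sum(1 for t, p in zip(y_true, y_prob) if t == 0 and p >= thr)
        fn = sum(1 for t, p in zip(y_true, y_prob) if t == 1 and p < thr)
        best = min(best, (fp * cost_fp + fn * cost_fn, thr))
    return best

y_true = [0, 0, 1, 1]
y_prob = [0.2, 0.4, 0.6, 0.8]
# A missed fraud case ($10) hurts far more than a false alarm ($1):
print(best_threshold(y_true, y_prob, cost_fp=1, cost_fn=10))  # → (0, 0.6)
```

The point is that the threshold is a business decision: change the cost ratio and the optimal cutoff moves, even though the model itself is unchanged.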

Pillar 6: Experimentation and ML Operations

  • This week, set up MLflow (or Weights &amp; Biases, Neptune, or a structured spreadsheet) and log every model run you do this month: parameters, metrics, and artifact location. You become a practitioner who treats experiments as data, building an institutional record of what has been tried and what has worked.
  • Before launching any A/B test, run a power calculation to determine the minimum sample size needed to detect your target effect size at 80 percent power and 5 percent significance. You become someone who knows before running an experiment whether it can actually answer the question, stopping underpowered tests before they waste time and mislead the business.
  • Replace any manual grid search in your current projects with Optuna, Ray Tune, or scikit-learn's RandomizedSearchCV, and log the best parameters and the search budget to your experiment tracker. You become a systematic optimizer who searches the parameter space efficiently rather than relying on intuition or inherited settings.
  • For every model you ship, produce a feature importance chart using permutation importance or SHAP values (not the default model-specific importance) and include it in the model card. You become a modeler who understands what is driving predictions, building trust with stakeholders and catching spurious features before they cause problems in production.
  • Create a one-page model card for your next production model covering: intended use, training data, evaluation metrics, known limitations, and recommended monitoring thresholds. You become an accountable ML practitioner whose models have documented contracts, making handoffs to engineering and future maintenance predictable rather than chaotic.
  • Once per quarter, simulate reverting a production model to its previous version on historical data and measure how much performance degrades, confirming the rollback plan actually works. You become someone who plans for failure before it happens, maintaining operational confidence in the models your team relies on.
  • For every classification model applied to decisions about people, compute precision, recall, and false positive rate broken out by at least two demographic or behavioral subgroups before shipping. You become a practitioner who treats fairness as a technical requirement, not a philosophical afterthought, protecting the business from both ethical and regulatory risk.
  • Set a performance review date for every model at deployment. If a model has not beaten baseline on fresh data within 90 days, write a deprecation proposal rather than leaving it running indefinitely. You become a practitioner who manages a healthy model portfolio, clearing technical debt before it accumulates into a production liability.
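The power calculation above needs nothing beyond the standard library. Here is a sketch of the usual two-proportion sample-size formula, with the 80% power and 5% significance defaults mirroring the text; treat it as a sanity check, not a replacement for your experimentation platform's calculator:

```python
import math
from statistics import NormalDist

def sample_size_per_arm(p1, p2, alpha=0.05, power=0.80):
    """Minimum subjects per arm for a two-sided two-proportion z-test."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value, two-sided
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2
    pooled = z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
    unpooled = z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    return math.ceil((pooled + unpooled) ** 2 / (p1 - p2) ** 2)

# Detecting a 10% -> 12% conversion lift needs a few thousand users per arm:
print(sample_size_per_arm(0.10, 0.12))
```

Running this before the test tells you immediately whether a two-week experiment on your traffic can detect the effect you care about, or whether you are about to launch an underpowered test.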

Pillar 7: Communication

  • Rewrite the executive summary of your next report so it opens with the key finding and its business implication, and moves all methodology to an appendix. You become a communicator who respects that your audience needs decisions, not a tour of your analytical process.
  • Limit every presentation to one visualization per claim you are making. Remove any chart that does not directly support a stated conclusion. You become someone whose presentations are remembered for clarity, not volume, making your insights easier to act on and share.
  • Audit your last three charts. Replace any use of color that is decorative rather than encoding information, and switch to a colorblind-accessible palette (Tableau 10, ColorBrewer) for all categorical data. You become a visualization practitioner whose choices serve the reader, not the software defaults, building charts that communicate accurately to every audience.
  • On every chart you create this week, add a text annotation directly on the chart that states the main takeaway in one sentence, so the conclusion is readable without the surrounding text. You become someone whose charts are self-contained communications, not puzzles that require a verbal explanation to decode.
  • For every analysis you complete, write a three-sentence brief: what question you answered, what you found, and what the business should do about it. Send it before the full report. You become the data scientist who senior leaders actually read, because you lead with the decision they need to make rather than the work you did to get there.
  • Volunteer to present your team's work at one cross-functional meeting per month where the majority of attendees are not data practitioners. You become comfortable translating technical work into business language under real conditions, developing the skill through practice rather than preparation alone.
  • After each major presentation or report, ask two stakeholders one specific question: what was the single most useful thing in this analysis and what was least useful. You become a communicator who improves through data, closing the loop between what you produce and what actually serves the people you are trying to help.
  • Identify the one question your team answers manually in a report every month, and build a self-service dashboard in Tableau, Looker, or Streamlit so the business can answer it themselves. You become a data scientist who eliminates recurring low-value requests from your own queue, freeing capacity for problems that actually require modeling.

Pillar 8: Continuous Learning

  • Subscribe to ArXiv Sanity, Papers With Code, or a domain-specific digest, and block 45 minutes every Friday to read and summarize one paper in a personal notes file. You become a practitioner connected to the research frontier, able to evaluate new methods on their actual merits rather than their marketing claims.
  • Once per quarter, pick a paper whose method you use but have never implemented, and build it from scratch in a notebook without using the library that wraps it. You become a data scientist who understands the tools you use at a fundamental level, building intuition that no tutorial can provide.
  • Enroll in and finish one online course per quarter that covers a topic outside your current comfort zone: causal inference, Bayesian methods, distributed computing, or domain-specific ML. You become someone who eliminates blind spots systematically rather than stumbling across them during a critical project.
  • Start one personal data science project per quarter using publicly available data, publish the code on GitHub, and write a short post describing what you learned. You become a practitioner with a body of work that demonstrates your capabilities independent of your employer, building career optionality through public output.
  • Register for one data science or applied statistics conference this year (NeurIPS, PyData, Strata, ODSC), attend at least five sessions outside your specialty, and write up three key takeaways. You become someone with a network and a perspective that extends beyond your current company and team, importing ideas that your organization has never encountered.
  • Work through all window functions, CTEs, lateral joins, and query execution plan analysis in your current database (Postgres, BigQuery, Snowflake) using real production queries. You become a data scientist who is never bottlenecked by data access, solving problems at the query layer that others wait for engineering to fix.
  • Read one foundational resource on causal inference this quarter (Hernán and Robins, Cunningham's Causal Inference: The Mixtape, or Pearl's Primer) and apply one method to a real problem. You become a scientist who can distinguish correlation from causation in practice, giving your organization the confidence to act on findings rather than just observe them.
  • Pick a project you completed 30 days ago, re-read the code and analysis without opening the original context, and write down three things you would do differently today. You become a practitioner who learns from your own past work, accumulating craft over time rather than repeating the same patterns indefinitely.

Track Your Data Scientist Goals

Turn this framework into a daily habit with our free browser extension. See your 64-action grid every time you open a new tab.

Load this into my extension →


Related Goal Frameworks