commit c19bcb36a21d750dcad19700093ad469b79b407b · davidgasquez.com/handbook

+2

Communication.md

···
16 16 - Break up a giant nuanced block into sections.
17 17 - If something is critical, make it visual.
18 18 - **Beware [semantic diffusion](https://martinfowler.com/bliki/SemanticDiffusion.html)**. Popular terms lose their precise meaning as they spread through repeated misinterpretation. Complex ideas get simplified, then distorted, like a game of telephone. The more desirable a term sounds, the more likely it gets misappropriated. Keep pointing back to original definitions rather than abandoning diffused terms.
19 + - [Use plain language for complex concepts](https://www.scientificdiscovery.dev/p/salonis-guide-to-data-visualization): most ideas can be understood by most people if explained clearly. But keep jargon when precision matters (e.g., distinguishing mean from median); in those cases, add a brief definition alongside the technical term.
20 + - [[Data Visualization]] can communicate faster than 1,000 words; [that power comes with responsibility](https://www.scientificdiscovery.dev/p/salonis-guide-to-data-visualization), since misleading charts spread just as easily as accurate ones.
19 21 - If you want an answer, you have to [[Asking Questions|ask a question]]. People typically have a lot to say, but they'll volunteer little.
20 22
21 23 ## Resources

+1 -1

Dashboards.md

···
9 9 - The data will move in one of three directions: up, down, or stay the same. Ahead of time, what will the stakeholder do in each case? Are all three answers the same?
10 10 - Strive to create dashboards that are either standalone or include links to provide the relevant context. Without meaning, data is just digits.
11 11 - Ideally, push visualizations to the tools that have the context (e.g. a chart in a Slack message, a chart in a Jira ticket, a chart in HubSpot, etc).
12 - - [Make your charts professional](https://bsky.app/profile/peck.phd/post/3lbfgjdnvy22k).
12 + - [Make your charts professional](https://bsky.app/profile/peck.phd/post/3lbfgjdnvy22k). [[Data Visualization]] is important.
13 13 - [Add all the possible context into the dashboard](https://www.youtube.com/watch?v=Kub2bXrKmOE):
14 14 - Instructions.
15 15 - Purpose and explanation of the data being shown.

+1 -16

Data Practices.md

···
82 82 4. **What if I want to know more?** A **link to additional information** can be valuable for people who have time for more than a quick scan and want to understand how you developed the insight, or do some of their own related exploration.
83 83 5. **What if I have a question?** Explicitly **inviting questions** and responses is crucial. It's the best part of sharing an insight! This is where you get to learn about things your colleagues know that you don't, or what they're curious about but has not yet risen to the level of becoming a data request from them.
84 84 6. **What if posting this prompts a whole bunch of follow-up questions, or exposes incorrect assumptions?** If you have hit on something that's interesting to a lot of people there likely will be questions that spin off, new ways to slice the data you're looking at, or assumptions you have made that need to be corrected.
85 + - Apply [[Data Visualization]] principles so shared charts stay clear when they travel.
85 86
86 87 ### Slack Template
87 88
···
94 95
95 96 _Questions, concerns, ideas? Thread on!_ 🧵
96 97 ```
97 -
98 - ## Charting Principles
99 -
100 - [Some principles to keep in mind when creating charts](https://www.eugenewei.com/blog/2017/11/13/remove-the-legend).
101 -
102 - - Don't include a legend; instead, label data series directly in the plot area. Usually labels to the right of the most recent data point are best. Some people argue that a legend is okay if you have more than one data series. My belief is that they're never needed on any well-constructed line graph.
103 - - Use thousands comma separators to make large figures easier to read
104 - - Related to that, never include more precision than is needed in data labels. For example, Excel often chooses two decimal places for currency formats, but most line graphs don't need that, and often you can round to 000's or millions to reduce data label size. If you're measuring figures in the billions and trillions, we don't need to see all those zeroes, in fact it makes it harder to read.
105 - - Format axis labels to match the format of the figures being measured; if it's US dollars, for example, format the labels as currency.
106 - - Look at the spacing of axis labels and increase the interval if they are too crowded. As Tufte counsels, always reduce non-data-ink as much as possible without losing communicative power.
107 - - Start your y-axis at zero (assuming you don't have negative values)
108 - - Try not to have too many data series; five to eight seems the usual limit, depending on how closely the lines cluster. On rare occasion, it's fine to exceed this; sometimes the sheer volume of data series is the point, to show a bunch of lines clustered. These are edge cases for a reason, however.
109 - - If you have too many data series, consider using small multiples if the situation warrants, for example if the y-axes can match in scale across all the multiples.
110 - - Include explanations for anomalous events directly on the graph; you may not always be there in person to explain your chart if it travels to other audiences.
111 - - Always note, usually below the graph, the source for the data.
112 - - Include targets for figures as asymptotes to help audiences see if you're on track to reach them.
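
The label-formatting advice in the removed Charting Principles list (thousands separators, no excess precision, rounding to thousands or millions) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `format_compact` is a hypothetical helper name, and the K/M/B/T suffixes and the one-decimal cap are choices made here, not something the handbook prescribes.

```python
def format_compact(value: float, currency: str = "") -> str:
    """Format a number as a short chart label (hypothetical helper).

    Large values are scaled to K/M/B/T with at most one decimal place;
    values below 1,000 are rounded to a whole number with comma separators.
    """
    units = [(1e12, "T"), (1e9, "B"), (1e6, "M"), (1e3, "K")]
    sign = "-" if value < 0 else ""
    magnitude = abs(value)
    for threshold, suffix in units:
        if magnitude >= threshold:
            scaled = magnitude / threshold
            # Keep one decimal, then drop a trailing ".0" so 3.0 becomes "3".
            text = f"{scaled:.1f}".rstrip("0").rstrip(".")
            return f"{sign}{currency}{text}{suffix}"
    return f"{sign}{currency}{magnitude:,.0f}"


# Full-precision labels still benefit from thousands separators.
full_label = f"{1234567:,}"

print(full_label)                           # 1,234,567
print(format_compact(1_234_567))            # 1.2M
print(format_compact(2_500_000_000, "$"))   # $2.5B
print(format_compact(-45_000, "$"))         # -$45K
```

Passing the currency symbol as a parameter keeps the axis-label format aligned with the unit being measured, which is the point of the "format axis labels to match the figures" bullet.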

+89

Data Visualization.md

···
1 + # Data Visualization
2 +
3 + Charts can be more memorable, shareable, and quickly understood than written explanations. They help explore data, explain concepts, and [share information effectively](https://www.scientificdiscovery.dev/p/salonis-guide-to-data-visualization). Clear visuals strengthen [[Communication]] and [[Dashboards]].
4 +
5 + ## Why Visualize Data
6 +
7 + - Visualizing data helps spot patterns, trends, and unusual data points that are hard to see in averages or summaries alone. A chart can reveal what an aggregate hides (e.g., [Anscombe's quartet](https://en.wikipedia.org/wiki/Anscombe%27s_quartet)).
8 + - Diagrams can explain concepts faster than text. A few-second visual can replace a long, confused explanation.
9 + - A good chart communicates faster than 1,000 words, but that power comes with responsibility. Misleading charts spread just as easily as accurate ones.
10 + - Plotting data helps spot potential errors and artefacts before publishing.
11 +
12 + ## Chart Type Selection
13 +
14 + - Pick the chart type and [[Metrics|metric]] that answers the exact question; rates, counts, and shares reveal different truths, so show multiple small views when one cut feels incomplete.
15 + - Use familiar or practical units (minutes, not standard deviations) when possible. They're easier to interpret and sense-check.
16 +
17 + ## Clarity
18 +
19 + - Keep labels horizontal and close to the data. Direct labels beat legends.
20 + - Don't include a legend; instead, [label data series directly](https://www.eugenewei.com/blog/2017/11/13/remove-the-legend) in the plot area (usually to the right of the most recent data point). Exception: many categories referring to many elements (e.g., maps).
21 + - Use small multiples when too many lines overlap. Splitting into panels makes individual trends easier to follow, though it trades off direct comparison between entities.
22 + - Sort categories logically (inherent order) or alphabetically (easier to skim).
23 + - [Data looks better naked](https://www.darkhorseanalytics.com/blog/data-looks-better-naked).
24 + - Reduce non-data-ink as much as possible without losing communicative power.
25 + - Don't include more precision than needed.
26 + - Format axis labels to match the figures being measured (e.g., currency for dollars).
27 + - Look at axis label spacing and increase intervals if crowded.
28 +
29 + ## Color
30 +
31 + - Match colors to concepts (plants → green, bad → red) so readers aren't forced into a [Stroop test](https://en.wikipedia.org/wiki/Stroop_effect).
32 + - Use [color-blind friendly palettes](https://davidmathlogic.com/colorblind/). About 4-5% of the population has some form of color blindness.
33 + - Direct labeling also helps color-blind readers distinguish categories.
34 +
35 + ## Axes
36 +
37 + - Start your y-axis at zero (assuming no negative values).
38 + - Avoid deceptive scale tricks.
39 + - Leave breathing room on axes instead of extreme zoom.
40 + - The lowest point shouldn't appear to be the lowest possible value.
41 + - Pair relative effects with absolute numbers (or prediction intervals instead of confidence intervals) to show real-world risk.
42 +
43 + ## Context
44 +
45 + - Include explanations for anomalous events directly on the graph.
46 + - For unfamiliar chart types, guide readers with annotations. Add a mini-tutorial if needed.
47 + - Include targets as asymptotes to help audiences see if you're on track.
48 + - Make the chart standalone. Add purpose, units, timeframe, and source so it can travel without losing meaning and slot into [[Dashboards]] or memos without extra explanation.
49 + - Titles for graphs should be the conclusion or key takeaway.
50 + - Always note the data source below the graph.
51 +
52 + ## Reproducibility
53 +
54 + - Publish provenance with the chart (data source, assumptions, and ideally a link to code) so others can verify or reuse it and keep [[Data Practices]] consistent.
55 + - A chart with no source isn't much better than claiming a trend was revealed in a dream.
56 +
57 + ## Common Pitfalls
58 +
59 + - Skip arrows or other glyphs that imply trends you can't support.
60 + - Don't use 3D charts. They distract and make values harder to read.
61 + - Avoid confidence intervals when showing variability. They're often [misinterpreted as ranges](https://www.dangoldstein.com/papers/Hofman_Goldstein_Hullman_Visualizing_Uncertainty_Mislead_Scientific.pdf). Consider prediction intervals or underlying percentages instead.
62 + - Try not to have too many data series; 5-8 is the usual limit depending on clustering.
63 +
64 + ## Guiding Questions
65 +
66 + Ask yourself when creating a visualization:
67 +
68 + 1. Is my chart type meaningful for the question?
69 + 2. Can I make it clearer?
70 + 3. If complicated, can I guide the viewer through it?
71 + 4. Does the chart work as a standalone?
72 + 5. Is my chart's presentation justifiable?
73 + 6. Is my chart reproducible?
74 +
75 + ## Tools
76 +
77 + - [Datawrapper](https://www.datawrapper.de/). Quick interactive charts with great defaults.
78 + - [Raw Graphs](https://app.rawgraphs.io/). Open-source, unusual chart types.
79 + - [Observable Plot](https://observablehq.com/plot/). JavaScript-based exploratory charts.
80 + - [Kepler](https://kepler.gl/). Geospatial visualization.
81 +
82 + ## Resources
83 +
84 + - [Saloni's Guide to Data Visualization](https://www.scientificdiscovery.dev/p/salonis-guide-to-data-visualization)
85 + - [The Data Visualization Catalogue](https://datavizcatalogue.com/) and [Project](https://datavizproject.com/)
86 + - [Visualization Curriculum](https://jjallaire.github.io/visualization-curriculum/)
87 + - [Guides for Visualizing Reality](https://flowingdata.com/2020/06/01/guides-for-visualizing-reality/)
88 + - [Datawrapper's Do's and Don'ts](https://www.datawrapper.de/blog/category/datavis-dos-and-donts)
89 + - [The Science of Visual Data Communication: What Works](https://journals.sagepub.com/doi/10.1177/15291006211051956)
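
The claim under "Why Visualize Data" that a chart can reveal what an aggregate hides can be checked numerically with Anscombe's quartet. The sketch below uses only the Python standard library and the published quartet values. The `pearson` and `summarize` helpers are illustrative names introduced here, not part of the handbook.

```python
from math import sqrt
from statistics import mean

# Anscombe's quartet: four (x, y) datasets with near-identical summaries.
X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
ANSCOMBE = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def pearson(xs, ys):
    """Pearson correlation coefficient, computed from its definition."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def summarize(xs, ys):
    """Summary statistics that hide the shape of the data."""
    return {"mean_x": mean(xs), "mean_y": mean(ys), "r": pearson(xs, ys)}


summaries = {name: summarize(xs, ys) for name, (xs, ys) in ANSCOMBE.items()}
for name, s in summaries.items():
    print(f"{name}: mean_x={s['mean_x']:.2f} mean_y={s['mean_y']:.2f} r={s['r']:.2f}")
```

All four datasets print roughly the same means and correlation, yet a scatter plot of each shows a line with noise, a curve, a line with one outlier, and a vertical cluster with one extreme point. The summary table alone cannot tell them apart.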