Conf42 Python 2025 - Online

- premiere 5PM GMT

Python + UX Teams: Supporting Research and Innovation Smartly


Abstract

Python skills aren't just for engineers; Python is a powerful ally for UX teams, too. At Wheels, we've integrated Python into our workflows to better support product development and innovation. This talk will discuss how the UX team uses Python to enable better research, prototyping, and collaboration.

Transcript

This transcript was autogenerated. To make changes, submit a PR.
Hi, everyone. Welcome to my presentation, Python and UX Teams: Smart Ways to Support Research and Innovation. I'm Liv McFarlin, and I'm the principal UX architect at Wheels. So before getting started, I want to warn you that this is not a technical talk. I won't be showing much code, and I won't be demonstrating how to implement anything. Instead, I'm going to take you through my journey of learning Python and applying it so I could alleviate some of the issues I and my team have faced since 2021. You'll hear me talk a lot about having big ideas, but then having to temper them with reality, because I don't have enough time to build what I think should be built, or because I can't get support to deploy my tools internally for others to use, or because I have to build something that can be used by people whose level of skill with Python varies. Regardless, these concerns haven't diminished the value of Python to me or to my team. In fact, I think they reinforce just how critical having Python skills on a UX team like mine is. I would argue that Python is a critical tool because, as the product development landscape evolves with advancements in AI and machine learning, teams like mine have to evaluate how they operate and evolve. And I think that, at least for UX professionals, that starts with looking at our tools and processes. Tools and processes are linked to an extent, right? So your processes, what you can do with others and for others, are impacted by your tools. It's more nuanced, though, than simply having more tools to do more things. Sometimes more is just more. A balance needs to be found between the number of tools a team has and the quality of those tools. At the risk of sounding clichéd, tools should help us work smarter and not harder. We should expect to have to learn a tool, right? But we should not be overburdened by a tool. UX tools and processes generally cover two big buckets for my team: design and research.
And while they generally feed back and forth between one another, they do have separate concerns. For example, on the design side, we need to move quickly to create prototypes that have to display large amounts of data. Or we have to consult internally with other groups about designs of products leveraging dashboards. And then for research, we need to answer a variety of questions that range from the more tactical, such as "Are the steps for completing a task easy to figure out?", to the more strategic, like "How have attitudes been changing about our products and services?" And it doesn't end with doing design and conducting research. We have to ensure that information we generate remains available to our partners across the org. And we have to do it in such a way that we, the UX team, are not a roadblock. We have to do it in a way that encourages collaboration with us but does not infringe on the autonomy of our internal partners. Now, as you can probably tell from the way I'm describing things so far, there are a lot of tasks caught in these buckets. And Python is a language that enables me to do a lot of heavy lifting for my team and others. By me, just me, one person on the team, learning Python, our UX team has been able to alleviate pain points during prototype creation, which has the great downstream effect of leading to faster prototype creation; create more realistic prototypes, at least in terms of processing and displaying data; support other teams participating in innovation and product development; speed up data analysis for research; and create more valuable insights by being able to perform a wider variety of analyses for qualitative and quantitative data. And all of these things combined have enabled us over the past several years to build new partnerships across the company with teams involved in the processes for product development, innovation, and research.
In today's presentation, I will walk you through how our team applies Python-based solutions for design, research, and collaboration. Some of the things I will mention have evolved since they were first created and used on my team. Some solutions have changed very little, and we have gone so far as to create our own guidebooks for standards of use that we treat as living documents and revise as needed. Other solutions are still in the development and testing phase. Now that I've introduced you to all my topics, I'll take us into the design portion of the presentation. And just to let you know, this will be a shorter part of the presentation. There are a lot of changes happening on the design side due to the inclusion of generative AI in some of the more common UX design tools, and this has impacted how I plan to continue working with Python for design activities. So for design, the key takeaway is that Python support for prototyping tools speeds up prototyping and design exploration within the UX team and with our internal partners. Now, before digging into this topic, I want to be very clear that when I speak about design, I'm speaking about prototyping and not finished designs that look like what you would find in production. This is because the largest gains we found for our team have been at this point in the design process. What's that gain? Time. And how do we get it? Fake data. Many times we are creating prototypes that require anywhere between 50 to 300 data points to make realistic-seeming interactions and reasonably realistic outputs. The data points that feed into these prototypes range from the very simple to the somewhat complex. And so just to give you some examples, on the simpler side, we need first names, last names, phone numbers, zip or postal codes, cities, states, and provinces. And then on the more complicated side, we need VINs, license plates, and company IDs.
As an added complication, we have some restrictions for data points like VINs, where we want to avoid accidentally creating a real VIN. So prior to the introduction of Python, what did the process of getting data for our prototypes look like? It varied. Sometimes people would create about 10 unique data points that they roughly modeled off of client data in our systems, and then they would copy and paste those data points over and over, making changes to them as needed, until they had the number of data points that would let them accurately depict the interaction. Sometimes people would have a set of links for multiple websites they could go to, to create the data for them, such as VIN generators and plate number generators. All in all, there's a lot of time and effort spent on making up data that seems realistic enough, or a lot of time and effort spent jumping across multiple sites and concatenating blocks of information, and then doing cleanup to make sure it fits the rules for how certain data points look in our system. As you can imagine, this is not only time-consuming, but it's error-prone. Just to give you a very quick example, say you need to prototype out a table with filtering, sorting, and actions like creating, updating, and deleting. If you aren't careful when differentiating your data points, your filters will capture too many data points, or you might delete too many data points. So then you have to go back and you have to fix everything. You have to hunt down where the problem went wrong and then put in your corrections. Now, you may also be wondering at this point, how long would it take to make the data up on your own? On average, 30 minutes up to an hour, not counting the time you would spend having to test the data in the prototype and make adjustments if you found anything odd in your prototype's behaviors.
So while this might not seem like a big sacrifice, given that this should only be a one-time thing, there's a couple things about our team I would like to point out. First, we're not a large team, so it isn't unusual for people to be working on two or more projects at a time. I, as the principal, am no exception; I can be on four projects at a time. And then changes might be necessary; there's sometimes some moving around of interactions or data points that has to happen. So how was this whole process solved? With a little package named Data Generator, which I'm showing in a rather abstract manner on this slide. And you can see that it's a collection of Python scripts and text files. And as its name implies, Data Generator generates fake data that the user can download as either an Excel file or a PDF. They can then use that data in their prototypes. And the fake data includes all of our core data points: names, contact information, vehicle details, unique IDs, and membership category and subcategory IDs. The Excel file output is probably the most used output type on our team. And this is because we use both Axure and Figma as design tools, and that file type is the easiest to use with both of them. The first generation of Data Generator ran through a very minimal UI built using MLJAR Studio. However, allowing access became an issue, so the UI portion was abandoned, and it now runs from the command line or within a Jupyter notebook. That solved an issue for our team. But there are other teams involved in innovation that we're obligated to support. For some time now, there's been an effort to improve the quality of our data visualizations, serve up new visualizations, and build new data products. And since the teams aligned with these efforts often roll up into different departments from ours, the UX team found we had some blind spots. So we made time to listen to these teams, and after hearing their concerns, we took some time to figure out how best to support them.
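To make the fake-data idea concrete, here is a minimal, self-contained sketch of a Data Generator-style script. Everything in it is illustrative, not the team's actual implementation: the name lists, the phone format, and the trick of forcing a letter 'O' (a character real VINs never use) into each VIN-like string so it can never collide with a real VIN are all assumptions.

```python
import csv
import random
import string

# Real VINs never contain I, O, or Q. Drawing from the legal character
# set but forcing an 'O' into every generated value guarantees the
# result can never be a real VIN.
VIN_CHARS = [c for c in string.ascii_uppercase + string.digits if c not in "IOQ"]

# Tiny illustrative name pools; a real tool would use much larger lists.
FIRST_NAMES = ["Ana", "Bert", "Cleo", "Dev", "Elin"]
LAST_NAMES = ["Ferris", "Gupta", "Hale", "Ibarra", "Jones"]

def fake_vin(rng: random.Random) -> str:
    """Return a 17-character VIN-like string that cannot be a real VIN."""
    chars = [rng.choice(VIN_CHARS) for _ in range(17)]
    chars[rng.randrange(17)] = "O"  # 'O' never appears in real VINs
    return "".join(chars)

def fake_record(rng: random.Random) -> dict:
    return {
        "first_name": rng.choice(FIRST_NAMES),
        "last_name": rng.choice(LAST_NAMES),
        "phone": f"555-{rng.randint(100, 999)}-{rng.randint(1000, 9999)}",
        "vin": fake_vin(rng),
    }

def generate_csv(path: str, n: int, seed: int = 42) -> list[dict]:
    """Write n fake records to a CSV file (openable in Excel) and return them."""
    rng = random.Random(seed)  # seeded, so prototype data is reproducible
    rows = [fake_record(rng) for _ in range(n)]
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=rows[0].keys())
        writer.writeheader()
        writer.writerows(rows)
    return rows

rows = generate_csv("prototype_data.csv", 100)
```

Seeding the generator means a teammate can regenerate the exact same data set later, which matters when a prototype's filters and sort orders were tuned against specific values.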
And this brought us again to our prototyping tools, and we found two things. First, Axure isn't always the easiest for creating higher-fidelity interactive data visualizations, or, depending on what you're doing, even static ones. And then Figma can produce high-fidelity visualizations, but sometimes the interactivity is a little bit lacking. And this was a concern because, while typically prototyping shouldn't always be too tied up in being super detailed, when it comes to data products, higher-fidelity designs and detailed interactions can make a difference during user testing between accurately understanding feedback and incorrectly interpreting feedback and actions. Another finding was that we don't have enough people on our team to always offer support beyond consultation sessions around implementing design standards. We have a style guide that anybody in the company can access, but it's not right to simply point a team to a set of guidelines and then abandon them. So it became a question of what, if any, tools exist that could solve our issues. One of the teams we work with happens to use Streamlit. If you're unfamiliar with Streamlit, it's an open-source framework for building data apps using Python. Again, and quite unexpectedly, Python came through as a way to devise a tool that saved the day for our team and another. Now, that's not to say that everything went smoothly right out of the gate. For starters, our design system components at that time were only built out in a low-code development app, so we had nothing in Streamlit. Furthermore, our partners didn't really have the time to take our standards and adapt them to their products. They also would not have access to the location where all of our components and assets live for our development teams. So that meant that whatever solution we did create for them would have to be one that lived in their little corner of the corporate world.
Thankfully, though, Streamlit is flexible enough that through a combination of CSS and Streamlit functions, we were able to recreate a substantial chunk of our style guide, complete with code snippets and examples of how to use them, for our partner team. And they can then host that app on one of their servers. Let me show you a little bit of work from the earliest implementation of what we made for them. On this slide is our page laying out colors for light mode. And you can see we've provided color chips, the CSS variable name, and the actual hex color code. And we do the same thing for dark mode if you were to click on that tab. And I do want to mention at this point that as part of this effort, our team supplied our colleagues with a style sheet pointing to all the web fonts they would need and with the light mode and dark mode colors all set up for them. And on this slide, you see a sample from our icon library, complete with code snippets that can be adapted, as well as the file path to use. And I will add that the file path is based on how we recommend other teams set up their images directory. Again, our colleagues in this instance aren't connected to where we have our assets for the broader development group, so we provided them with the necessary SVG files to start and then gave a recommendation on how to access them. And, in case you're curious, I know it's a bit hard to see, but on the left of this slide is what the code looks like for the Streamlit icon library page. And you'll notice the top two lines: import streamlit.components.v1 as components, and from utils import load_css, generate_icon_table, generate_footer. I'm calling all of this out because it's a great capability when building out this style guide app: in Streamlit, you can write Python functions to generate chunks of HTML that will then be treated like components.
If you look at line 54 in the image on the left, you can see the call to generate_icon_table that will display each of the icons in the engine icons folder. Then in the image on the right, you can see the actual generate_icon_table function. I do want to stress, though, that recreating our entire style guide in this way is not possible. There are multiple components in our design system that have to be created from scratch for use in Streamlit due to styling choices that were made before we knew Streamlit even needed to be on our radar. Some of those components include buttons, radio buttons, tabs, input fields, notifications, and modals. It's a lot. But the bright side here is that Streamlit has tutorials and templates for creating custom components. So as time permits, these components are being built using a React template. And to date, only buttons and input fields have been completed. But we expect to have more time this year to work on some of the other components. Now, before wrapping up this section on design, I want to state that I expect our use of Python within design activities to increase, despite the growing presence of generative AI in design tools. And this will be due in part to that growing presence, but also to the types of products and services we will need to design and then test. So, for example, I expect that Streamlit will actually be used to prototype out dashboards and reports as we build new data products, because, more so than our other tools, it can do what we need very well. So next in this presentation, I will talk about our use of Python in the UX research process. And research has really been where the use of Python has shined for our team. So if I were to identify the major point of this section, I would say Python expands our research toolkit so we can provide deeper insights and more widely apply our findings.
Now, before I jump into telling you how Python has improved our research practice, let me take a moment to quickly tell you about the types of research our UX team deals in, so you can understand why we use Python like we do. One way to categorize our research activities is as generative or evaluative. Generative research creates a deep understanding of user needs and motivations to uncover problems for which we can then build solutions. And evaluative research evaluates a solution, like a prototype product or service, with respect to a problem that's been previously identified. Another way to think about UX research is to view it as either qualitative or quantitative. And this is probably the way most people are familiar with thinking of UX research, right? Qualitative UX research focuses on the perceptions, behaviors, and emotions of experiences. And quantitative UX research focuses on measuring aspects of experiences, both subjective and objective. And based on these descriptions, you can probably tell that depending on what you want to know, you can lean more heavily on qualitative methods or quantitative methods, or you can even combine them. And so the big takeaway from these two ways of classifying UX research is that they can be deployed in a tactical manner or a strategic manner, with the objective that these two approaches will play off of one another during innovation and product development. And so now, this is where I bring up a topic that has appeared repeatedly in this presentation: the extent to which you can do things. In this case, conducting research and blending tactical and strategic research activities depends partly on your tools. Obviously, the better your tools, the better insights you can turn out. For us, having better tools means multiple things. First, we can move faster. Then, we can do a wider variety of analyses. We can communicate our findings better, and we can expand the data sources we use.
And a really good example of how some of these things fit together is in the analysis of qualitative data. And that's actually the bulk of our data, qualitative feedback. And so we get this data from user testing, interviews, and surveys. And mostly we're interested in identifying topics or themes and suggestions for products and services. So typically the process for uncovering what we're interested in is rather slow and involved. There's a lot of reading and a lot of manually applying tags to observations in the data, and occasionally reaching out to teammates to discuss how you have interpreted and tagged the observations. So again, it's a rather involved process, and that can easily take a week or longer depending on how much data you have, the tools available to you for the work, and how big the team is you're working with on the research project. So from my perspective, this domain had a lot of opportunity not only for speeding things up, but also for experimenting with what information we could extract. And on that last point, incorporating Python into the analysis process enabled us to start looking at sentiment and emotion, as well as introduce topic modeling and apply new data visualizations. You might be unsurprised to learn, though, that all of it started out pretty slow. But over time, the value really added up as I've included new libraries and tried out new approaches to see what fits best for our needs. And so this slide lays out the progression of how we started incorporating Python as a tool for qualitative data analysis. So going all the way back to 2021, I kicked off the effort with sentiment analysis and then expanded it rather quickly to include some basic text metrics. And this is because it's a fairly common question to ask how the features of text change with sentiment. So for example, we might want to know whether the length of comments, measured in word counts, changes with valence.
And then for the second phase, and that was, I would say, late 2021 into early-to-mid 2022, I began experimenting with different types of topic modeling and how to present the information from that work. And the third phase, which I would say is ongoing, has expanded to include using pre-trained models from Hugging Face as well as some of the features of spaCy. And it's really this phase which has enabled our team to provide deeper insights to our partners across the organization, as we can now blend activities from all phases to explore links between topics, sentiment, and emotion, and see how the various text features we track relate to those things. And spaCy in particular has improved how we analyze text data, as it's rather easy to create domain-specific NER tags. So now we can quickly identify where certain features and products are called out in the comments, and then also link those back with topics and sentiments. Just as an example, another area where we had a strong gain was in using LLMs and some very basic prompt engineering. And here we started with Scikit-LLM and OpenAI. It was really just a test to see how well an LLM would perform at few-shot classification and zero-shot classification activities without fine-tuning. And the results were actually pretty solid, both after I and another person on the team reviewed the output. So as a team, we went forward with developing guidelines for using LLMs as analysis tools. And at a high level, those guidelines mention things like always pairing LLMs with a UXer, so somebody with domain knowledge and project knowledge can evaluate the output, catch issues early, and change prompt-writing strategy if needed; clarity and specificity when writing prompts for analysis, which was developed over time from our personal experience and from recommendations released by reputable sources; and then, finally, code snippets to use for prompts.
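The domain-specific NER idea mentioned above can be sketched with spaCy's rule-based EntityRuler, which needs no pretrained model. The labels and phrases here are invented examples for illustration, not the team's real tag set.

```python
import spacy

# A blank English pipeline is enough for rule-based NER: no model download.
nlp = spacy.blank("en")
ruler = nlp.add_pipe("entity_ruler")

# Hypothetical domain phrases; a real tag set would come from the product domain.
ruler.add_patterns([
    {"label": "PRODUCT_FEATURE", "pattern": "fuel card"},
    {"label": "PRODUCT_FEATURE", "pattern": "mileage report"},
    {"label": "SERVICE", "pattern": "roadside assistance"},
])

doc = nlp("The fuel card worked, but the mileage report was confusing.")
entities = [(ent.text, ent.label_) for ent in doc.ents]
```

Once comments are tagged this way, each mention can be joined back to that comment's topic and sentiment scores, which is the linking step described above.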
As great as this was, concerns mounted pretty quickly about the safety of not using local LLMs, so we eventually shifted over to using Ollama. Now, Ollama, if you're not familiar with it, is a way for you to download LLMs and run them locally. With this change to Ollama, we shifted from using OpenAI models to Meta's Llama, and there hasn't been any downgrade in terms of the quality of results we receive for the aforementioned classification strategies. Also, the shift to Ollama brought with it experimentation with LangChain and then scikit-ollama. And one thing driving this shift is that while the use of LLMs and prompt engineering benefits my team, my team does need to be able to use the tools. Not everybody on my team wants to learn Python or has the time to do it. So in that regard, scikit-ollama is a bit more friendly for people who aren't really familiar or comfortable with Python. Another thing is, it's very hard for me to create a UI and then have it stored somewhere internally for my teammates to use. There's just not a lot of support for that from IT. So I have to find the simplest ways to do things, with lots of plug-and-play code snippets. Now, that doesn't mean I abandoned the explorations of using LangChain with LLMs. It has served as a key component for some experiments that I'll talk about later in the collaboration section of this presentation. So now, I'd like to talk a little bit about how our approach to quantitative data analysis has changed. Our main sources of quantitative data include surveys, ratings from user tests, and quantification of aspects of qualitative data. We have some web analytics, but the vendor we use doesn't give us much of the raw data. They aggregate it for us, and they make it very difficult for us to go back and get the raw data. For my purposes, it's useless. So until all that's resolved, it's not really a great source of quantitative data for us.
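The local zero-shot classification workflow described above can be sketched roughly as follows. This is a hedged illustration, not the team's actual code: the model name "llama3" and the label set are assumptions, and classify() needs the ollama package plus a running local Ollama server, so only the prompt builder runs without one.

```python
# Illustrative label set for sentiment-style zero-shot classification.
LABELS = ["positive", "negative", "neutral"]

def build_zero_shot_prompt(comment: str, labels: list[str] = LABELS) -> str:
    """Zero-shot: the prompt names the labels but provides no examples."""
    return (
        "Classify the user comment into exactly one of these labels: "
        + ", ".join(labels)
        + ". Reply with the label only.\n\nComment: " + comment
    )

def classify(comment: str, model: str = "llama3") -> str:
    """Send the prompt to a locally running Ollama model."""
    import ollama  # local-only inference: no data leaves the machine
    response = ollama.chat(
        model=model,
        messages=[{"role": "user", "content": build_zero_shot_prompt(comment)}],
    )
    return response["message"]["content"].strip().lower()

prompt = build_zero_shot_prompt("The new dashboard is a huge improvement.")
```

A few-shot variant would simply append a handful of labeled example comments to the prompt before the target comment.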
But the quickest gains we got from using Python were with exploratory data analysis, data visualizations, and basic methods like linear and logistic regression. So at this point, this is early 2021, and we had one survey that had roughly 3,000 data points. It's not a lot, but you can still do something with 3,000 data points. And we combined this survey analysis with sentiment analysis using VADER. And for our first attempt, our effort was actually pretty well received. So you might be wondering then, what won the day for us with just this one attempt? Previously, analysis was done using Excel, which, no shade on Excel, you can produce legitimate analyses with it. But with Python, we were able to produce better and more varied visualizations, finer detail showing how responses on one variable related to another, and then explorations of links between ratings and sentiments. So that one survey got our team more credibility and enabled my boss and me to keep pushing for more data. We found other groups conducting surveys or collecting data on interactions with our customers, and we simply asked them for their raw data so we could devise new hypotheses and try out new analyses. So now in 2025, as you can see from the timeline on this slide, the use of Python as a tool for data analysis has grown well beyond exploratory data analysis, visualizations, and regressions. And that's not to say we don't still use all of that. We do. But where we have large enough data sets to support such work, we can perform cluster analyses, association mining, and we've even done network analysis. And so these methods have all helped create new ways of thinking about our clients that go beyond things like industry effects, fleet size effects, or even fleet composition effects. And they've also helped us understand things like how attitudes toward one set of services may impact attitudes toward another set.
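A toy version of that early survey-plus-sentiment workflow, linking ratings to comment sentiment with a basic logistic regression, might look like the sketch below. The data is synthetic and the column names are illustrative; the sentiment column stands in for a VADER-style compound score from -1 to 1.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in for survey data: a 1-5 satisfaction rating and a
# VADER-style compound sentiment score for the open-ended comment.
df = pd.DataFrame({
    "rating": [1, 2, 2, 3, 4, 4, 5, 5, 3, 1, 5, 4],
    "sentiment": [-0.8, -0.5, -0.2, 0.0, 0.4, 0.5, 0.9, 0.7, 0.1, -0.6, 0.8, 0.3],
})

# Exploratory step: how strongly do ratings and comment sentiment co-move?
correlation = df["rating"].corr(df["sentiment"])

# Basic method: does comment sentiment predict a high (4+) rating?
df["promoter"] = (df["rating"] >= 4).astype(int)
model = LogisticRegression().fit(df[["sentiment"]], df["promoter"])
predicted = model.predict([[0.9], [-0.9]])  # very positive vs. very negative comment
```

Even this small amount of code goes beyond what a typical spreadsheet workflow offers: the same DataFrame can feed cross-tabs, plots, and models without re-entering the data.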
And so these are all insights that, had I not made the effort to learn Python and apply it, our team likely would not have been able to provide, ever. But now, because of it, we have teams even outside of product development and innovation functions who take an interest in this work that we do. Now, while it's great that we're able to provide such value using data provided by internal sources, a key part of innovation is looking outside your world, right? Going externally for useful data. And Python does a great job of enabling that. So let me offer you a very early example with context, and I'll share that my employer is a fleet management firm, and as such, we have an interest in understanding attitudes toward electric vehicles. So in the spring of 2022, using TWINT, the Twitter intelligence tool, I was able to scrape several thousand tweets about EVs for topic modeling and text analytics. Now, while it's still possible to get some data from X, we've since moved on to other, more reliable sources when we want to explore specific topics. And the internet is a very large place, and tools like Beautiful Soup and Selenium have really helped us gather what we need. Now, this next portion of my presentation, collaboration, covers some of the recent and in-progress experimentation with Python happening on my team. And the broader point from this section is: Python can help provide better access to information, which facilitates collaboration between UX and our internal partners. Now, if you remember, during the research portion of my presentation, I mentioned that I was still exploring how to use LangChain with LLMs. One idea we looked at in 2023, but deprioritized, was the use of LangChain's SQL chain and a GPT model with a PostgreSQL database to interact with data. And it started with simple things like retrieving raw data, generating counts, getting averages. Very simple. And there were a few reasons behind the idea.
First, we have some data from studies before I joined the team, but it's all over the place, and so trying to find that data, if you weren't involved in the study, can be a bit of an adventure, let's just put it that way. Plus, we're always generating new data. So this activity would require getting all of that data organized and stored so that it could be interacted with by us or by others across the organization. Second, not everybody on our team wants to dig into data. We are a mixed team of practitioners, where some people like me are more geared toward research and analytical processes, while others strictly prefer design activities. So I have to ensure that everyone is supported regardless of their comfort level and preference for activities. And third, sometimes others in the organization want to see our data. The intent behind the proof of concept was well received, but there were questions about interacting with it, the big one being: how do we control the prompts given to the tool so that the user will get back the specific data they want, with the correct SQL queries applied? This ultimately led into a broader discussion about database design and maintaining data integrity that, frankly, our team is not equipped to handle at the moment given our size and project loads. For these reasons, this idea has been deprioritized. That doesn't mean we gave up on it; we're just not focusing a lot of effort on it. However, we still do have an issue with others in the organization wanting to see our past reports, or even look at relevant UX or HCI research from peer-reviewed publications. And while this isn't the exact same problem as using an agent to interact with data in a database, it's not entirely dissimilar, right? It's still a user feeding prompts to an LLM that then goes to retrieve information stored in a database. And in general, this is a very well-documented scenario, right? There are online tutorials you can read on Medium or watch on YouTube.
So for the initial phase of this work, I chose to pair ChromaDB with LangChain and Ollama. While there are better options than Chroma out there, I have some constraints: I have to move quickly, I can't incur additional costs, and there's nobody in IT to help me. So knowing all of that, Chroma seemed to be the best option for me. Then I had the choice of testing this idea out with different sets of documents. One choice was to use our internal reports. And these are templated documents with consistent layouts and headers, where the only real difference is the actual reported content. And with everything being so similar, there's not a lot of challenge here. The other choice, though, was to use a subset of PDFs our team uses like a mini library. And this choice is where things got interesting, because these PDFs aren't just journal articles. Some of them are industry reports; others are design toolkits. So there are pockets of consistency, but for the most part, there's not a lot of it. And then the next thing to consider is the knowledge level of whoever would be using a tool like this. So regardless of whether we're storing UX reports or UX and HCI publications, the people on my team are the ones who know all the jargon and how we internally organize documents, while our partners in product management, for example, might not know all that same stuff. So there has to be a way to store each PDF with some information that could help the person interacting with the tool. And as a team, we thought that information could include things like title, authors, publication dates, summaries, and topic lists. Now, the problem is we already have over 100 PDFs from external sources alone. So assigning someone to go through each article and generate this information is not a good idea. And considering that we will add more documents in the future, it's just even more unrealistic to ask someone to do that.
But as our team has already discovered, LLMs with appropriate prompt engineering can substantially reduce this kind of workload. The tool in this particular phase allowed someone to submit a document, extract and generate key pieces of information from that document, and then store the information along with the document. And this current slide shows the prompt used for testing the extraction and generation activities. You can see we assign a system role as a helpful research assistant and a user role as the entity that would actually use the output from the helpful research assistant. And we use a formatted string to present the request for a summary, up to five topics, the title, authors, and publication date for a particular document. And this slide shows the code for accessing and printing the LLM's output, and the output itself. As you can see, it's fairly coherent, right? But depending on the contents of the document, the output quality dropped. From what I noticed, that seemed to almost always be for documents that relied heavily on visuals and less on text. And so that would be more for things like design toolkit documents, as opposed to research articles or industry reports. Those last two would always turn out fairly coherent outputs. So the next phase in this work, which we're hoping to make progress on this year, looks at using LlamaIndex. If you're not familiar with LlamaIndex, it's a framework for connecting LLMs and data sources. So basically, if you need a RAG model for your application, you may wind up using LlamaIndex. And that's the next stage, really: transforming this into a RAG application. And for me, here, the question is, what type of RAG model to apply here? Graph RAG is probably not appropriate, but maybe RAG with re-ranking is.
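Since the slide's actual prompt isn't reproduced in the transcript, here is a hedged reconstruction of the pattern described: a system role as a helpful research assistant, plus a formatted user request for a summary, up to five topics, title, authors, and publication date. Field wording and message structure are assumptions.

```python
SYSTEM_PROMPT = "You are a helpful research assistant."

def build_metadata_prompt(document_text: str, max_topics: int = 5) -> list[dict]:
    """Build chat messages asking an LLM to extract/generate PDF metadata.

    The message list follows the common system/user chat format and can be
    passed to most chat APIs (OpenAI-style or Ollama's chat endpoint).
    """
    user_request = (
        f"For the document below, provide: a short summary, up to "
        f"{max_topics} topics, the title, the authors, and the "
        f"publication date. Label each field with a heading.\n\n"
        f"Document:\n{document_text}"
    )
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": user_request},
    ]

messages = build_metadata_prompt("Text extracted from a PDF elsewhere...")
```

The returned metadata can then be stored alongside the document's embeddings in the vector store, so that less specialized users have titles, summaries, and topics to orient by.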
So there's going to be some time on my part to do some deeper dives into the different RAG models, and to talk to some people in my company, get some feedback, and then just see what works. Try it out, see what happens. Now, we're almost at the end of the presentation, but before we wrap everything up, I would like to share with you four things I think you should keep in mind if you ever find yourself in a situation like mine. First, find the problems to solve that are repetitive, error-prone, or mentally draining. You would be surprised what alleviating tasks like that can do for your own or someone else's well-being on a project. Second, if others will be using your solution, don't assume they can do what you can do, or even that they want to. And this is something I think is pretty innate to UXers, just because of the problems we have to solve, but helping others sometimes means meeting them on their level. And like I said, while it's common for UXers to have that viewpoint, I think in general it could extend well to all creating and building. Third, know when to step back from an idea. An idea might seem good or sound interesting, but it may not meet the greatest need, or it may take way longer to implement than the time you have to spare. And if you have to step back, there's nothing wrong with doing that. Sometimes it's just a sign that now is not the right time for your idea. And then finally, don't let a lack of support stop you from building something out. Sometimes you won't get the support you need to bring the best version of your idea to life, and that's okay. Just adapt it within the constraints you face, if you can, and show that off to others. And when people see the value in it, use them and their excitement to help you get the support you need to make improvements for your vision. So with that, we've reached the end of my presentation. If you've stuck with me to the end, I want you to know that I appreciate you.
So thank you for spending your time with me today. Bye.

Liv McFarlin

Principal UX Architect @ Wheels



