Can data be used to harm other people? The reality is that in some parts of the world, it is already being used to trial increased surveillance and fuel oppressive social systems. This is a clear indication that technology will only ever be a reflection of the human hands responsible for its creation and maintenance.
In this episode, Alexander McCaig and Jason Rigby discuss the social implications of China’s authoritarian hold on its citizens—particularly its military-industrial complex’s creation of a three-billion US dollar supercomputer satellite center in the country’s Wenchang Spacecraft Launch Site.
Despite the official statement claiming that this enormous tech site will be used as part of a “massive constellation of commercial satellites” that can “offer services from high-speed internet for aircraft to tracking coal shipments,” we think it pays to be a little critical of the difference between what’s being reported—and what’s being done.
China has been the subject of widespread criticism after it announced the development of a social credit system in 2011. This system builds on a mass surveillance structure of more than 770 million cameras installed across the country as of 2019, with expectations that it will eventually hit the one billion mark by the end of 2021.
The social credit system is used to score individuals and companies based on a collection of data from different sources. Individuals are rewarded for appropriate social behavior, such as proper conduct on mass transportation systems and adherence to waste sorting rules in their city.
Conversely, they will be punished for “negative behavior,” which can include elderly residents suing family members for not visiting regularly (Shanghai); cheating in online video games (Suzhou); failing to show up for reservations at hotels or restaurants (Suzhou); and failing to pick up take-out food that was ordered (Suzhou).
Individuals with a poor credit score will face restrictions on loans, transportation, and even education. As part of the system’s effort to encourage good behavior, some local governments offer incentives to those with higher credit scores. These people are prioritized in health care provision and can even have deposits waived when renting public housing.
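The reward-and-penalty logic described above can be pictured as a toy rule table. Everything here, including the behaviors, weights, and thresholds, is invented for illustration and does not reflect how the actual system is implemented:

```python
# Hypothetical sketch of a rule-based social scoring system.
# All rules, weights, and thresholds are invented for illustration.

RULES = {
    "proper_transit_conduct": +5,     # rewarded behavior
    "waste_sorting_compliance": +5,
    "no_show_reservation": -10,       # punished behavior
    "cheating_online_games": -15,
}

def apply_events(score, events):
    """Apply a sequence of observed behaviors to a starting score."""
    for event in events:
        score += RULES.get(event, 0)
    return score

def privileges(score):
    """Illustrative thresholds gating access to services."""
    if score >= 110:
        return "priority healthcare, deposit-free public housing"
    if score < 90:
        return "restricted loans, transport, and education"
    return "standard access"

final = apply_events(100, ["waste_sorting_compliance", "no_show_reservation"])
# final -> 95, which still falls in the "standard access" band
```

The point of the sketch is how coarse such a system is: a single missed reservation and a month of diligent recycling collapse into one number, and that number alone decides which services a person can reach.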
Businesses are required to submit data on their operations and on their partners and suppliers. Their credit score can be influenced by their behavior and ratings from their suppliers.
Finally, individuals and businesses that are deemed “untrustworthy” will be publicly named and shamed.
A society where human behavior is closely controlled and dictated by the state, through the latest technological capabilities, sounds like the plot of a dystopian sci-fi novel like George Orwell’s 1984—but this is already the reality for more than a billion individuals and 28 million companies in China.
What are the implications of the social credit system? Critics are quick to point out that the government is using incomplete or inaccurate data to determine the provision of social privileges, and sometimes rights, for its own citizens.
The implementation of a stable credit system is also dependent on the strength of basic services, such as regulation in the credit industry and data protection. Those with low credit scores may find it difficult to continue to progress in society, particularly if there is no concrete policy that can support their rehabilitation or reintegration.
Lastly, in a world where we have yet to fully account for all the factors that contribute to how and why we make decisions, this could easily turn into a system that disproportionately punishes people and businesses who are already struggling.
Beyond 1984, the hit television series Black Mirror also showed a glimpse into a society where people are controlled by their data. One of its episodes, entitled Nosedive, draws viewers into a world where everyone’s social status is dictated by the quality of their ratings on social media. A series of unfortunate events outside the protagonist’s control has a massive impact on her score, which in turn cripples her socioeconomic status.
These important pieces of media and literature highlight how the agnosticism of data, alongside the impact of human intervention, can drastically change the impact of technology. It is our responsibility to ensure that AI is developed with a conscientious hand, and that it is capable of empowering minorities instead of widening the inequality gap.
Humanity’s thirst for innovating new and exciting ways of harnessing technology compels us to participate in a shared initiative: one that will help preserve our free will, personal autonomy, and human rights.
The TARTLE platform is our life’s work toward ensuring that your personal data remains personal. Everything you share is given with your full consent, and we help you connect to other like-minded individuals and organizations who can represent your interests.
Netflix is one of the biggest names of the Digital Age. It went from being a new way to rent movies, to a streaming service aggregating everyone else’s movies and TV shows, to being a powerhouse creator of original content in its own right. Oh, and it killed one-time giant Blockbuster along the way and helped spawn a whole new branch of the entertainment world. However, it’s fair to ask if Netflix is really about making movies and TV shows. Is that really their main concern? The answer, of course, is ‘no’. The company is primarily about making money. For that, it needs subscribers, and to get the largest number of subscribers possible, it makes use of a lot of data.
Naturally, they start with what they hope you want to see and basically spam your feed with a bunch of generically popular content. Over time, they will try to narrow it down. How do they do that? They make use of algorithms to gather information on what you watch. They pay attention to more than just what you click on; that alone is both too simple and not terribly informative. How many times have you clicked on a movie only to get about ten minutes in and decide you don’t want to watch it? The algorithms pay attention to that as well. What you watch, how long you watch it for, when you watch it: all of that goes into the algorithm. From there, Netflix’s hope is that it will be able to find similar movies and put them in your recommended feed. Sounds simple, doesn’t it? Yes and no.
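The kind of engagement weighting described above can be sketched in a few lines. The catalog, genres, and weighting scheme here are all hypothetical; the real Netflix system is far more elaborate, but the principle is the same: weight each view by how much of it was actually watched, not by the click alone.

```python
# Minimal sketch of engagement-weighted recommendations.
# Titles, genres, and the weighting scheme are invented for illustration.

CATALOG = {
    "Space Saga":    "sci-fi",
    "Star Drifters": "sci-fi",
    "Baking Duel":   "cooking",
    "Knead to Know": "cooking",
}

def genre_affinity(history):
    """history: list of (title, minutes_watched, runtime_minutes).
    Weight each view by completion ratio, so ten abandoned minutes of a
    two-hour movie count far less than a nearly finished one."""
    affinity = {}
    for title, watched, runtime in history:
        genre = CATALOG[title]
        affinity[genre] = affinity.get(genre, 0.0) + watched / runtime
    return affinity

def recommend(history):
    """Suggest unwatched titles from the highest-affinity genre."""
    affinity = genre_affinity(history)
    best = max(affinity, key=affinity.get)
    seen = {title for title, _, _ in history}
    return [t for t, g in CATALOG.items() if g == best and t not in seen]

history = [
    ("Space Saga", 110, 120),  # nearly finished: strong signal
    ("Baking Duel", 10, 60),   # abandoned after ten minutes: weak signal
]
```

Here `recommend(history)` surfaces the unseen sci-fi title, because the abandoned cooking show barely registers. Notice how easily this breaks if two roommates with opposite tastes feed the same history.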
For one, there are holes in this system. Not just the occasional recommendation you would never plan on watching, but major problems that can break the algorithm altogether. Say you have roommates who all share an account but have very different tastes in movies. Or you have kids. Chances are you watch different things when they are around. That, of course, is what the profiles on Netflix and every other streaming service are all about. If they can break out each person individually, the algorithm has a chance to work. However, how many people really bother with the different profiles? I’m guessing it’s not as many as Netflix would like. Also, what if you have a busy schedule and rarely have time to watch a full two-hour movie? You only have fifteen minutes here, twenty minutes there, and typically work on something while the movie is on. So it might take a whole week to watch one movie. That kind of person likely wrecks the algorithm entirely.
Not to mention, how long does it take to build up a worthwhile profile of a given subscriber? One week? A month? A year? It will vary from person to person based on how much they watch, meaning how effective the recommendations are will vary a lot from one subscriber to the next. In short, Netflix’s algorithms are extremely inefficient in a variety of circumstances, and that means they are wasting time and money building user profiles that don’t work.
What should they do then? What would be a more efficient means of building those user profiles? Netflix could work with us at TARTLE. They could go directly to subscribers and ask what it is that they would like to watch. Who their favorite actors and directors are. When do they watch? Do they prefer movies or series? Do they like their series dumped all at once or would they prefer a weekly schedule? Netflix could talk directly to its subscribers and get feedback directly from them and so build a far more accurate profile than any other method. This would be faster and cheaper and in the end far more financially rewarding. Which means they could spend that extra time and money making better content to draw in more subscribers.
What’s your data worth? www.tartle.co
In line with TARTLE’s mission to promote climate stability, one of its Big 7, Jason and Alex welcomed Suzanne Simard to the podcast.
Suzanne, who is a Professor of Forest Ecology at the University of British Columbia and the author of Finding the Mother Tree: Discovering the Wisdom of the Forest, explores the significance of emphasizing data-driven action on climate change — particularly on the old growth forests of British Columbia.
She offers an eye-opening perspective on the deterioration of these old growth forests and the colossal amount of environmental data with untapped potential. Throughout their discussion, she also draws plenty of thoughtful parallels between big data and the fight against deforestation.
The complex data networks making up a bulk of the tech systems we are heavily dependent on today are eerily similar to the interactions of trees in old growth forests.
Suzanne realized that trees are in constant conversation underground. This is made possible with the help of sophisticated fungal networks that link one tree to the rest of the forest. However, this large-scale communication network is displaced when the old forests are cut down — and even when plantations are created, the network “goes silent for a little while.”
Even when the forest begins to rebuild, it would take decades — if not centuries — for these areas to regain the same complexity that they once had when they interacted as a society of trees in an old growth forest.
Beyond the impact of this loss to local biodiversity, there is much to be said about how clearcutting these old growth forests is akin to cutting off entire societies from communicating with one another. This, inherently, is an injustice to our environment and a setback in our attempt to become true stewards of the earth.
Suzanne introduced the importance of selective harvesting, a regenerative method that allows the forest to grow back. While this is the best step forward, most companies today prefer to clearcut entire forests because of the reduced cost.
One particularly harmful aspect of clearcutting is the harvest of “mother trees” — big, old trees that are both the most ecologically valuable in the forest and the most profitable.
When corporations use clearcutting to profit from forests, they set back the local environment in five distinct ways: the loss of biodiversity; the loss of carbon, an element that’s important for sustaining life; a rise in water levels; a change in soil temperature; and an increased rate of decomposition.
For many people, it’s easier to focus on the problems that are directly in their sight instead of trying to grasp the bigger picture. Regretfully, this decision becomes a matter of survival in some situations: low-income families depend heavily on the sachet economy to get by, tech-challenged SMEs in rural areas still rely on paper documentation to keep track of their business, and the shift to renewable products can often be difficult because these items have a higher upfront cost.
We are challenged to think of the environment in two ways: first, to look beyond the concerns that plague our day to day activities; and second, to help others who are not as fortunate or as privileged as us get the access to look beyond as well.
There is a massive network of corporations, institutions, and individuals that enable the pace at which our climate is dying. It’s going to take a whole new level of mindfulness before we start changing how this works on a noticeable level — not just for ourselves and our loved ones, but for our communities as well.
Suzanne points out that humanity’s relationship with the environment has evolved significantly. On a continuum, forestry started off as an exploitative practice; but as we realized that we only had a finite amount of resources to work with, we made an attempt to regulate, and then manage, these harvests.
In the ‘80s and ‘90s, the US entered a period of science-based management. It was here, she explains, that the big leagues understood the connection between deforestation, climate change and big data.
But despite our progress, we have yet to reach a stage where we can accurately call ourselves stewards of the environment. This title calls for us to be proactive about the land and to hold ourselves accountable for climate change, not just as a present concern but also as a part of our intergenerational ethic.
This time, it’s not just a question of what your data is worth. How much is our collective data, as aspiring stewards of this planet, worth?
Data 1.0 to 4.0
Data and the way we use it has been evolving since the early days of the digital age. However, for most, that evolution has been slow, painfully slow. A recent article about Coinbase and its attempt to take the use of data to the next level illustrates, in some ways unintentionally, just how far we have to go. First, some background.
Data 1.0 is really just manually gathering data, processing it, and figuring out how to react to it. Think of it as customer surveys, or going into your production records and manually entering them into spreadsheets so you can see it all and start to draw some insight from it. If that sounds cumbersome, it is. Unfortunately, that is also where many companies are stuck these days. Perhaps the methods of gathering the data have gotten more sophisticated, but it is still collecting a bunch of lagging data and trying to plan the future off what was going on in the past.
Data 2.0 is more about using data and more advanced technology to automate a few basic functions, freeing people up to do more creative things. The processing of the data is faster, but it is still lagging, leaving businesses to make their best guess about what the future will look like. Thanks to the improved processes, the guess is better, often good, but it still falls short of the potential that sound data management offers to any organization.
The article in question posits that Coinbase is leading the way to Data 3.0. Coinbase is a company that deals in the buying and selling of cryptocurrency. As such, it deals extensively with financial matters, the various crypto products it offers, and technology: not just the software that allows customers to buy and sell crypto, but also the blockchain technology that many cryptocurrencies use to ensure ownership is secure and verified. Dealing with all of these different aspects of the business makes it imperative for Coinbase, and any company operating in the digital world, to improve automation and decentralize its data so that everyone in the company can easily access it and streamline operations. That’s what Data 3.0 is: getting data out of the silos we like to put it in and learning how to integrate it across multiple operations, which helps keep the whole company on track. It also makes the company far more flexible and nimble, better able to deal with the rapidly shifting digital environment.
However great Data 3.0 might sound, it is still dealing with the same information that Data 1.0 makes use of. In short, at the end of the day, Data 3.0 is still just a sophisticated 1.0. Yet this rut, which persists despite the rapidly changing and improving technology available, is only part of the story. We have to account for human nature as well. The fact is, we like to hear what we like to hear, and people will often take the easy way out, manipulating data to produce the answers they think their bosses are looking for. That means leaders wind up making bad decisions based on worse data. This, of course, is a temptation at any level of data usage. However, it is even worse in today’s hypercompetitive environment. The constant pressure leads to shortcuts when, in truth, having reliable data is more important than ever.
What is the solution? The solution is Data 4.0. With Data 4.0 you go straight to the source. You get your information not from algorithms, not by extrapolating from data skimmed off electronic interactions, but from actual people. This data is as close to real time as possible and is so specific that it becomes harder to manipulate. And who would want to? The whole point of going to the individual is to avoid all the middle men and the filters that can skew data in the first place. That is exactly what TARTLE hopes to do, create a Data 4.0 environment that will provide quality reliable data to help people make good decisions of genuine benefit to all.
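The difference between inferring a preference from behavioral proxies and simply recording the answer a person gave can be sketched in a few lines. The data and function names here are invented for illustration; they are not TARTLE's API:

```python
# Hypothetical contrast between Data 1.0-3.0 style inference (guessing a
# preference from behavioral proxies) and a Data 4.0 approach (using the
# answer the person gave directly). All names and data are invented.

def infer_preference(clickstream):
    """Guess what someone likes from what they clicked most often --
    a lagging, indirect signal that can easily be wrong."""
    counts = {}
    for item in clickstream:
        counts[item] = counts.get(item, 0) + 1
    return max(counts, key=counts.get)

def declared_preference(survey_answer):
    """Data 4.0: no proxy, no middleman -- record the stated answer."""
    return survey_answer

clicks = ["ads", "ads", "news", "sports"]  # clicks forced by page layout
stated = "sports"                          # what the person actually said

inferred = infer_preference(clicks)   # the proxy points at "ads"
direct = declared_preference(stated)  # the person points at "sports"
```

The inferred answer is wrong not because the math failed but because the input was a proxy. Going to the source removes that whole class of error.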
What’s your data worth?
Jung, Stats, and You
“The statistical method shows the facts in the light of the ideal average, but that does not give us a picture of their empirical reality.” – Carl Jung
Pithy, isn’t it? Okay, it’s actually a rather dense quote. What it means is “stop putting people in buckets”. Thanks for coming to our TED talk and we hope you enjoy the day. Just kidding, let’s dig into this a bit.
First, isn’t it interesting how people can often spot problems early, long before the rest of us catch up? Typically, we ignore them and their concerns until years, sometimes decades, later, when someone else rediscovers the lost insight. That is the case here. That quote from the great psychologist is from 1957, decades before the digital revolution was underway, yet it is incredibly relevant to the present day. It is an indictment of our overreliance on statistics in our decision-making processes.
Even the fact that we tend to ignore insights like this, insights that are ahead of their time, proves the point of the quote. We ignore things like this based on an unconscious analysis that is grounded in statistics. Fifteen years ago, most people would have said, “I’ll never really ignore people in favor of my phone or an attractive spreadsheet.” Just because a thing has never happened, or has happened only rarely, doesn’t mean it can’t or won’t happen. We hear this kind of thing in politics all the time. “No one has ever been elected with this…” Insert whatever statistical fact you want. And then it happens.
The truth is, statistics are great predictors, right up until they aren’t. Just because a thing usually happens in a certain way, there is no particular reason to think it always will. What’s worse is that we think knowing some statistics is the same thing as really understanding something. We tend to treat them as explanatory, when they are only descriptive at best. There are many times when statistics aren’t even properly descriptive. Instead, they are illustrative of the analyst’s biases.
This is particularly true when applied to people. Imagine someone who gets a ton of ads for Christmas music. Why might that be? Because they often buy Christmas albums? Not necessarily. Remember, the algorithms that drive the ads operate by cross-referencing certain behaviors. In this case, let’s imagine that this person with all the Christmas music ads tends to order a new ugly sweater on Amazon every year. The algorithm assumes that the person likes everything having to do with Christmas. Maybe this individual does like most things associated with the holiday. Everything but Christmas music. In fact, our sweater-wearing friend hates Christmas music but endures it for the sake of the annual ugly sweater party with his friends. I can guarantee those ads are not going to convert him into a sale for the latest Mariah Carey Christmas album.
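The sweater-buyer problem reduces to a tiny piece of code. The bundling rules and the shopper profile below are invented for illustration, but they capture exactly how a cross-referencing recommender goes wrong:

```python
# Toy illustration of the sweater-buyer problem: a cross-referencing
# recommender bundles "bought an ugly Christmas sweater" with every other
# Christmas product, even though this shopper dislikes Christmas music.
# The bundling rules and shopper profile are invented.

CORRELATED = {
    "ugly_christmas_sweater": ["christmas_album", "tree_lights", "eggnog"],
}

def recommend(purchases):
    """Recommend everything statistically bundled with past purchases."""
    recs = []
    for item in purchases:
        recs.extend(CORRELATED.get(item, []))
    return recs

# What the shopper would actually say if anyone asked him directly:
actual_dislikes = {"christmas_album"}

recs = recommend(["ugly_christmas_sweater"])
wasted = [r for r in recs if r in actual_dislikes]  # ads that can never convert
```

The statistic (sweater buyers tend to buy Christmas albums) is descriptive of the group, but it explains nothing about this individual, and the only way to find the gap is to ask him.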
Why do we do this? Why do we make all of these guesses? Why rely so much on assumptions and allow our decisions to be guided by statistics and algorithms? Because it is easy. Find a few statistical correlations and develop an algorithm from them and then run all your data through that. Broadly speaking, the picture it forms may even be accurate. But you don’t really know for sure. You certainly don’t know where it falls short or why. The only way you really can be sure is by going to the individuals behind the statistics, the people actually generating the data that all these programs are trying to classify. Then ask them, “what were you thinking when you did ‘x’?” That’s how you get real knowledge, and real understanding, by treating data with the respect you give to the people who generate it. Because that data represents them and their thoughts. That is powerful and understanding is the first step on the path to real, truthful knowledge.
What’s your data worth?
Optipulse Pt. 2
Last time, we talked a bit about Mathis Shinnick and his latest company, OptiPulse. We focused mostly on how the company’s new Near Infrared (NIR) technology is being developed and how it will change the world of digital high-speed communication when it is brought to market. Today, we are going to talk more about where Mathis would like to see the company and its technology go, and about the kinds of investors who are helping make that vision a reality.
One of the most refreshing things about the company is the many grassroots investors that have helped get OptiPulse off the ground. Using an independent funding website called WeFunder (think of it as a Kickstarter analog for investors) has allowed people to get involved for as little as $100. Not only does this help decentralize the typical investing model of looking for a handful of high rollers, it can also help gauge what the demand for the product will be. If you have a lot of people investing for that minimum amount, it shows that there is a desire for what OptiPulse is offering. Even better, Mathis points out that the comments from these investors reflect something more than just a desire to get a return on their investment. The most frequent comment is that these people are eager to get OptiPulse into their own communities. These early adopters are able to see the potential being offered, which will help get their own out-of-the-way corners of the world better connected.
There are other uses for the technology as well. The line-of-sight NIR sensors have potential use in the self-driving cars that are getting close to hitting the market. Given the low cost, small size, and low power consumption of the sensors, it would be easy to have roads lined with sensors that communicate with other sensors in the car. Not only would this keep the car on the road, it would also let the car know when something was between it and the road. If another car, a bike, or a dog breaks the signal between the sensors, the car will know it instantly. If the road sensors are arranged correctly, they could even communicate with each other to let the vehicle’s computer know of hazards up ahead, or of a fast-approaching car on a side street, giving the car the ability to see beyond its own line of sight. And again, given the low power consumption of OptiPulse’s NIR sensors and emitters, it would be possible to power large numbers of them with a couple of solar panels.
The same technology can also be implemented for tracking information at remote installations such as oil and natural gas pipelines. A network of arrays could be used to transmit data constantly to service centers without the need for cables. Or, one could go with fewer arrays and fly a drone over the line to collect data and then transmit it back to the service centers. Again, no need for cables that have to be repaired whenever a squirrel decides to take a bite (yes, that happens).
As Mathis pointed out last time, it isn’t necessary that OptiPulse completely replace existing infrastructure either. Because of the vast amounts of bandwidth available in the near infrared part of the spectrum, OptiPulse can accommodate existing 4G and 5G technology and actually boost the performance of those devices.
With all of this potential, it is little wonder that OptiPulse has been able to attract a large number of investors eager to see the product brought to market. We’ll be waiting eagerly as Mathis and the others at OptiPulse work to bring their vision of a better connected world to life.
What’s your data worth?
Mathis Shinnick has been working with startups and investors for years. Most recently, he co-founded OptiPulse. Based in Albuquerque, New Mexico, OptiPulse is working to revolutionize digital communications. The company is developing Near Infrared technology that tests show is capable of data transmission speeds that leave 5G and even the much-lauded Starlink in the dust. How fast? How about 10GB/sec? The potential is actually much greater, but that is all current off-the-shelf electronics can handle.
Just as exciting as the speed is the range that it allows. Photons in the near-infrared part of the spectrum oscillate at far higher frequencies than radio or microwaves, which is where all that bandwidth comes from. Normally, atmospheric absorption and scattering limit the range of an optical link. OptiPulse has patented technology that can focus the energy much like a laser, giving it much greater range. If you are wondering how great that range is, it can send a beam to space with the kind of bandwidth mentioned above. And because the emitters draw so little power, the system is cheaper and greener to use.
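The scale of the bandwidth on offer can be made concrete with c = λf. The 850 nm wavelength below is an assumption for illustration (a common near-infrared wavelength; the article does not state OptiPulse's operating wavelength), compared against a typical 28 GHz 5G mmWave carrier:

```python
# Back-of-envelope comparison of carrier frequencies via c = lambda * f.
# The 850 nm NIR wavelength is an assumption for illustration only.

C = 299_792_458  # speed of light in vacuum, m/s

def frequency_hz(wavelength_m):
    """Convert a wavelength in meters to frequency in hertz."""
    return C / wavelength_m

nir = frequency_hz(850e-9)  # ~3.5e14 Hz, i.e. roughly 350 THz
mmwave_5g = 28e9            # a typical 5G mmWave carrier, Hz

ratio = nir / mmwave_5g     # the NIR carrier is ~12,600x higher in frequency
```

Even a tiny fractional slice of spectrum around a carrier that high dwarfs the entire radio band, which is why optical links can move so much more data.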
That increased range plus lower cost will make OptiPulse the perfect choice for bringing broadband communications to out-of-the-way areas. Well-placed towers could provide communication for places that are difficult to reach with any kind of cable. Right away, that makes the OptiPulse system an obviously better alternative than fiber optics or any other option that relies on a hard, continuous infrastructure. Naturally, this saves considerably on construction costs. There will still be costs, of course. The links are line of sight, requiring emitters and detectors to be in view of each other. While that means a number of collectors and emitters are necessary, it also means the data is more secure, since it is harder to intercept a direct beam than something diffused over a wide area.
Another interesting benefit of this developing technology, Mathis points out, is that since it uses light to transmit information, it operates outside of any regulated space. OptiPulse therefore isn’t competing with all the cell towers and Starlink that are operating in the radio band. Those other means of communication have to deal with interference from other signals, signals that require devices to filter out the noise that results from the interference. Again, lower cost than other alternatives.
Mathis also says OptiPulse will be easier to update. Since it is a modular system based around accessible towers, rather than cables in the ground or satellites in space, changing out hardware is just a drive from the nearest service hub. Therefore, as communications technology develops, OptiPulse will be able to keep up with it much more easily than anything else on or near the market.
Yet, OptiPulse need not completely take over either. It could actually work with existing fiber optic technology. Remember, fiber optics is just using light to transmit information through a glass fiber in the ground. Existing cable could be mated to an OptiPulse tower to extend the range of the network rather than having to incur the expense of laying new cable.
Where are things going in the future? The shift to online work that occurred as a result of Covid has brought a lot of awareness to the need for better connectivity, and to the fact that 5G isn’t delivering on its promise. Even in the relatively few places where it has been implemented, it is underperforming. That has helped OptiPulse attract a number of investors to help bring the company to the next phase, bringing the next phase of connectivity closer to you.
What’s your data worth?
Algae Goes Moo
Guess what, the Earth is only so big. Yes, it is big but it still has a finite size. That means there is only so much room for people and the resources necessary to support them. How much? There is a lot of disagreement on that but in principle, there is only so much space so there is definitely a limit. Which means it makes sense to spend time thinking about how we can make the best use of the resources we have until someone figures out how to efficiently terraform Mars.
One of the many resources with limited space is farmland. We’ve frankly done a great job in the last hundred years of figuring out how to get more and more out of less. Unfortunately, that has in part been through the use of growth hormones, fertilizers, and pesticides. While those have allowed us to get more food out of less land, they have also had downstream effects on the environment that are less than desirable. Yet we don’t want people to starve. So, what do we do?
One of the biggest consumers of farmland isn’t actually people but cattle. There is a ton of farmland used to grow food for cattle. Combine that with what we use for people, factor in that there is only so much fertile land to be had, and it doesn’t take a genius to see that eventually we will run out. For one, there are only so many nutrients in the soil, and if we don’t give farmland a break, it will eventually run out of them. That’s part of how the USA rose in power so quickly: it had farmland that was virtually untapped, allowing people there to grow crops that were larger and more nutrient-dense than what was possible in Europe, where people had already been farming for centuries. Many cultures have understood this, which is why they developed the concept of crop rotation. Different crops use different nutrients, so switching crops lets the soil build up the ones not currently being used. It’s also why the Mosaic Law directed the ancient Hebrews to periodically let their fields lie fallow. It gave time for plants and bugs to decay, animals to poop, and bugs to till the soil and rejuvenate it.
Now though, it is very hard to turn back to that system of farming, maybe impossible. Again, what do we do? One possible solution lies in another problem that we’ve created. Ironically, one that has been exacerbated by industrial farming – algae.
While fertilizers have contributed to algae blooms in the coastal areas of our oceans which have in turn darkened those oceans and could lead to disruptions of our ecosystems on that end (a subject we’ve gone more in depth on elsewhere), that very algae has a lot of nutrients. Nutrients that could be of use if we harvested more of them and used them to feed both people and cattle. Why not just people? Let’s be honest, Travolta was right in Pulp Fiction, “bacon is good”. And where would Sammy J be without his royale with cheese? People aren’t going to stop eating meat en masse no matter how much some might want that to happen. So it makes sense to find a more sustainable and environmentally friendly way to feed the cattle and so feed the people. Why not make use of this abundant resource and in so doing help solve another problem that we’ve inadvertently created?
For the record, this doesn’t mean we shouldn’t continue to pursue reforms to industrial farming. Those are sorely needed. Yet, it is this kind of outside the box thinking that will allow people to live in a way that is both sustainable and comfortable. TARTLE strives to promote and stimulate that kind of creative problem solving by encouraging people to share and donate their data with research organizations. That will help them pursue solutions to our most pressing problems with the best data available – yours.
What’s your data worth?
Talk Python to Me
Looks like we are on a bit of a roll with interviews over at TARTLE HQ. Recently, Alex and Jason had the chance to sit and talk with Michael Kennedy, founder of the Talk Python to Me podcast. If you are unfamiliar with it, Python is a programming language that is easy enough to learn that regular people can do some cool things with it. Michael realized this years ago and started looking around for a podcast telling the stories of people who adapt Python to whatever field they are in to help answer genuinely interesting questions. Unfortunately, back in 2015 no one was really doing that yet, so he had to start his own. Talk Python to Me is the result. Since then, Michael has interviewed a wide variety of programmers and gained real insight into how to help people get more into the world of programming and data science.
One of the big effects of the rise of the Python programming language is the democratization of both coding and data science. Because it is so easy to gain a working knowledge of, a vast number of people, from philosophers to economists, have been able to use it in their given fields. Kennedy has noticed that many people who don’t consider themselves coders or software developers are making use of the language. It may be Python’s greatest accomplishment that people who would normally never get into coding are making effective use of it.
Another insight Michael shared is that getting people interested in data science and coding comes down less to education and more to inspiration. That runs contrary to the prevailing opinion of ten years ago. Back then, people complained that there weren’t nearly enough data scientists and tried to convince all kinds of people to become one. However, that wasn’t working. They didn’t understand it, and it just wasn’t interesting to them.
So, what does it mean to talk about inspiration instead of education? It means that instead of telling people how important things are, we need to actually show them. Demonstrate what data science can do and let people actually play around with it a bit instead of just cramming people’s heads full of information.
Also, don’t just shove data science down people’s throats. Instead, show them how it can benefit what they are already passionate about. If a high school student is passionate about volcanoes, show him how Python can help him better predict eruptions. If another is interested in tracing the evolution of language, show her how the programming language can be of use in showing how one language evolves into another. Doing that makes the concept of data science not just real, but interesting.
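To make that “show, don’t tell” idea concrete, here is a small sketch of how little Python a curious student would actually need. The eruption years below are invented for illustration, not real volcano data; the point is how approachable the code is:

```python
# Toy example: how many years, on average, pass between eruptions?
# The years below are made up for illustration.
eruption_years = [1971, 1984, 1992, 2003, 2011, 2020]

# Gaps between consecutive eruptions
gaps = [b - a for a, b in zip(eruption_years, eruption_years[1:])]

average_gap = sum(gaps) / len(gaps)
print(f"Average gap between eruptions: {average_gap:.1f} years")
```

A few readable lines like these can spark far more interest than a lecture on why data science matters.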
It isn’t just the sciences that benefit either. Guests on Michael’s podcast have included members of F1 and NASCAR racing teams. They’ve found that switching from Excel to Python has given them an edge on the race track.
Pointing out things like that makes it easier and more enticing for people to learn Python. Suddenly, they are able to really dig into their own source data, maybe for the first time. That lets them save both time and money by avoiding third parties altogether. They no longer have to pay some other service to analyze the data they’ve collected, or spend time sifting through reports on data someone else has collected. Now, a person can gather their own data and maintain control of it from beginning to end. Saving time and money while putting people in control of their own data? That’s something TARTLE can get behind.
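As a hedged sketch of what “getting into your own source data” can look like, here is the kind of analysis a race team might run on its own lap-time log using nothing but Python’s standard library. The file name, columns, and numbers are hypothetical:

```python
import csv
import io
from statistics import mean

# A hypothetical lap-time log a team might have kept in a spreadsheet.
# In practice this would be csv.DictReader(open("laps.csv")).
raw = io.StringIO(
    "lap,driver,seconds\n"
    "1,A,92.4\n"
    "2,A,91.8\n"
    "3,A,91.1\n"
)

laps = list(csv.DictReader(raw))
times = [float(row["seconds"]) for row in laps]

print(f"Best lap: {min(times)}s, average: {mean(times):.2f}s")
```

No third-party service, no exported reports: the team collects the data, analyzes it, and keeps it.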
What’s your data worth?
Interview with Shumin Luan
You know what we like at TARTLE? We like talking to cool and intelligent people who are doing cool and intelligent things out there in the world. Especially if they are doing those cool and intelligent things with data. That happens to be the case with Shumin Luan, a budding young data scientist at Boston College whom Alex and Jason recently had the chance to interview for TARTLEcast. During the conversation Shumin revealed that he has always been interested in data and the way it impacts people’s lives.
Even going back to when he was very young, he had a fascination with the world of finance. He saw how data was needed to properly operate in that world. Without it, investors might as well make their decisions by throwing darts at a board. However, that isn’t what got him thinking of moving more formally into the realm of data science.
That part of Shumin’s journey began when he was working in Dubai, UAE (United Arab Emirates) as an analyst in sales and logistics. While working in that role, he was able to see how data science could help make the company he was working for more efficient and aid in making better decisions. One area where the data scientist saw room for improvement was the shipping division. Shumin was able to identify a lot of inefficiency in the loading dock and shipping warehouse.
The warehouse was not organized to quickly bring orders up to the front even under the best of conditions. Throw in a rush or any sort of computer problem and operations could be significantly disrupted. Here, the truckers picking up the products represented a golden opportunity. The inefficiency in bringing product up from where it sat in the warehouse to the trucks meant that the drivers spent an inordinate amount of time just sitting around doing nothing. What’s more, the drivers are paid by the hour. Since new efficiencies were put in place, the results have been a lot of time and money saved. How did a young data scientist manage to increase the efficiency of a major company operating out of one of the busiest places in the world?
One of the most important things that he did was to comb through the data and find that certain products were more likely to be sold and shipped together than others. Shumin simply recommended storing such items together and in a way that was readily accessible. That meant there was less time spent waiting around for the truck drivers and fewer stops had to be made since the biggest sellers were going together on the same truck from the same warehouse. Otherwise, one truck driver might have been making multiple stops just to fill up his truck. Now, everything could be efficiently stored in one or two warehouses so the drivers could get on the road to delivering products to the customers much more quickly. That means happier customers because they get their products faster and a more profitable company because they aren’t paying people to sit in trucks. It also has an environmental benefit in that with fewer trips getting made, there are fewer greenhouse gas emissions to be concerned with.
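The kind of analysis described above, finding which products tend to ship together, can be sketched in a few lines of Python. This is an illustrative toy version, not Shumin’s actual code; the product names and orders are invented:

```python
from collections import Counter
from itertools import combinations

# Hypothetical order history: each order is the set of products shipped together.
orders = [
    {"drill", "screws", "anchors"},
    {"drill", "screws"},
    {"paint", "brushes"},
    {"drill", "anchors"},
    {"paint", "brushes", "tape"},
]

# Count how often each pair of products appears in the same order.
pair_counts = Counter()
for order in orders:
    for pair in combinations(sorted(order), 2):
        pair_counts[pair] += 1

# The most frequent pairs are candidates for storing side by side.
for pair, n in pair_counts.most_common(3):
    print(pair, n)
```

On a real order history, the top pairs tell the warehouse which items to shelve near each other and load onto the same truck.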
Shumin also briefly touched on one of the biggest challenges confronting data scientists today – the sheer amount of data available. One of the most important things such people have to do is sift through the mountains of information out there to find the valuable data that is needed in order to conduct a meaningful analysis. The good data is definitely out there, it just takes patience and skill to find it.
That’s where TARTLE comes in. Through our data marketplace we make it possible for researchers like Shumin to find the best data of all, data that comes right from the source. Instead of trying to sift through tons of third party data, we get right to the gold they are looking for, enabling them to make better and faster decisions.
What’s your data worth?