I used to work for a company that built satellite receivers that would be installed in all sorts of weird remote environments in order to pull radio or tv from satellite and rebroadcast locally.
If we pushed a broken update it might mean someone from the radio company would have to make a trip to go pull the device and send it to us physically.
Our upgrader did not run as root, but one time we had to move a file as root, so I had to figure out a way to exploit our machine reliably from a local user, gain root, and move the file out of the way. We'd then deploy this over the satellite head end and N remote units would receive and run the upgrade autonomously. Fun stuff.
Turns out we had a separate process running that listened on a local socket and would run any command it received as root. Nobody remembered building or releasing it but it made my work quick.
It's crazy to me that this is possible in the first place. Standard practice is to have a fleet of test vehicles that are effectively production except in an early release group.
Or, you know, having an A/B boot partition scheme with a watchdog. Things that have been around for decades at this point.
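For anyone who hasn't seen the A/B-plus-watchdog pattern, here is a minimal Python sketch of the boot decision. Everything in it (`Slot`, `BootState`, `choose_slot`, the retry budget of 3) is a hypothetical illustration, not any vendor's actual bootloader: the new image gets a limited number of boot attempts, and if it never confirms itself healthy, the bootloader falls back to the old slot.

```python
from dataclasses import dataclass

MAX_BOOT_ATTEMPTS = 3  # assumed retry budget before falling back


@dataclass
class Slot:
    name: str
    version: str
    bootable: bool = True


@dataclass
class BootState:
    """Hypothetical bootloader state; a real device keeps this in persistent storage."""
    active: Slot
    fallback: Slot
    attempts_left: int = MAX_BOOT_ATTEMPTS
    confirmed: bool = False  # set once the new image reports itself healthy


def choose_slot(state: BootState) -> Slot:
    """Run on every boot: decide which slot to start."""
    if state.confirmed:
        return state.active
    if state.attempts_left > 0:
        state.attempts_left -= 1
        return state.active
    # Retry budget exhausted without a health confirmation: roll back.
    state.active.bootable = False
    state.active, state.fallback = state.fallback, state.active
    state.confirmed = True  # the old image is the known-good one
    return state.active


def mark_boot_successful(state: BootState) -> None:
    """Called by the freshly booted image once its health checks pass."""
    state.confirmed = True
```

The watchdog's job in this picture is to force the reboot when the new image hangs before it ever gets to call `mark_boot_successful`.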
Disclaimer: Former Googler, Worked closely with Automotive.
It's easy to underestimate how hard and expensive it is to build, deploy, and remotely upgrade software that runs reliably on a fleet of diverse cars (different models, different years, slightly different components from batch to batch, etc.). It makes updating a mobile phone OS look trivial in comparison.
So far, only Tesla seems to be able to update car software remotely, regularly and reliably. I'm certain it's neither easy nor cheap.
All things considered, physical buttons and dials are probably easier and cheaper, because they don't require software updates!
Bringing CI/CD mindset to cars is probably not a great idea. Software updates to commuter vehicles should have a high bar for operational standards, and a simple thing such as an expired certificate should have never been deployed. Having isolated networks in vehicles helps but doesn't prevent broken updates from, eventually, bricking the cars.
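The expired-certificate case mentioned above is the kind of thing a release gate can catch mechanically. A rough Python sketch, assuming the pipeline already knows the signing certificate's expiry time; the function name `cert_ok_for_release` and the 90-day margin are made up for illustration:

```python
from datetime import datetime, timedelta, timezone


def cert_ok_for_release(not_after: datetime, now: datetime,
                        min_remaining: timedelta = timedelta(days=90)) -> bool:
    """Refuse to ship if the signing certificate expires too soon after `now`."""
    return not_after - now >= min_remaining


# Example: a certificate with 30 days left fails the (assumed) 90-day margin.
now = datetime(2023, 11, 13, tzinfo=timezone.utc)
print(cert_ok_for_release(now + timedelta(days=30), now))
```

The margin matters because vehicles that are parked or offline may not install a build until weeks after it was signed.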
Interesting to note that Ford's approach to updating software is far more conservative and car-like. It can be done fully offline via USB, but the instructions list, as a necessary step, that you kindly upload the log files written to the memory stick back to them when complete. Presumably so they can track and stop incidents like this before they happen fleet-wide.
Rivian seems more like a "ship it and we'll fix it in the next sprint!" company.
How do other manufacturers handle updates?
When will humans be crazy enough to update the firmware of artificial hearts OTA?
Updating cars with new features OTA, even "just" the infotainment, can cost lives, because a confused driver isn't keeping their eyes on the road.
It should be forbidden and every change should be made clear to the driver, shown in detail, and should need verification twice before being accepted. There must not be any kind of surprise in a car for the driver.
It should even be possible to skip an update or stop updating at all.
What a nightmare. This is where software engineering meets "real" engineering, where a "bug" has potentially life threatening consequences.
Is it possible, as a licensee of the Rivian vehicle system, to disable the automatic OTA updates without having expert-level knowledge or tooling?
Also, yes, I'm specifically avoiding using the word "owner" above for obvious reasons.
Stuff like this is why I don't want OTA updates in my cars. Let the car dealership deal with it during regular maintenance. They'll be on the hook for fixing it before handing the car back to me.
This is why I don't really want my car to have any antenna (that receives/interprets code) or receive OTA updates, ever.
I'd like to please force any attackers to at least be within 50 feet of my TPMS, instead of being literally anywhere on the planet.
A car doesn't need data updates, and definitely not code updates[1]
1. source: every car built in previous century.
This is a bit of a nightmare scenario, and it's why, when updating remotely, you always test the update on your own fleet first. Always.
Move fast and break things that move fast…
I don’t really like or trust most (if not all) of the established automakers, but there is something to be said for having several decades (over a century in some cases) of experience building potential killing machines vs. a company that’s not even 15 years old. The established players have put out cars which suffered freak malfunctions, but Rivian (and Tesla) seem to be struggling more with QA.
Non-rhetorical question: do companies have safeguards for critical components like braking systems, or are they also prone to catastrophic failure if a software engineer pushes a bad commit?
This is why I have a Dumbcar connected to a Smartphone via bluetooth.
I have preorders in for the R1S, the Volvo EX90, and the Kia EV9. I passed once already on buying the R1S when they had one in town available for immediate purchase, simply because they refuse to adopt CarPlay.
This incident does NOT give me confidence that Rivian is likely to offer a better alternative to CarPlay, despite their statements otherwise.
I suspect the EX90 will be what I land on eventually.
Whoever makes the first affordable, tight-tolerance electric car that doesn’t spy on you and doesn’t need special care will win the market.
This is actually a topic that I think about from time to time: how to make aggressive changes to software while it is running. In the Ruby world you have monkeypatching, and the Linux kernel has livepatching.
For example, if you have a distributed system and you want to upgrade a component that every caller uses: you have a large exercise on your hands where you might have to roll out a change over time and then clean up your incremental branches where you have to handle two control flow paths through the code. It reminds me of Google's protobuf required field discussions.
It reminds me of repository-per-microservice and a Java library that other microservices use and updating a dependency and having to deploy the change to every service.
It's like trying to change wheels on a car while the car is moving or refueling a jet in flight.
Unison lang is trying to solve this problem I think, by allowing multiple versions of a function to be available.
Migrations in databases are painful too.
One solution I've thought of, which is probably overengineered, is that API call sites are abstract objects whose schema and arguments are centrally deployed. I called this a "protocol manager".
The idea is you write all your code to use a "span" and have contextual data in a span, and you can include or exclude data in a span with a non-software rollout. Your communication schema of RPC and API calls is a runtime decided thing, not hardcoded.
If you have N deployed versions of code and you want to upgrade to X, you have to test the upgrade from each of the N versions to X. So nobody does that.
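To make the size of that test matrix concrete, a tiny Python sketch. The helper `upgrade_paths` and the older version strings are hypothetical; only 2023.42 appears elsewhere in this thread:

```python
def upgrade_paths(deployed_versions, target):
    """Every (from, to) pair that has to be exercised before shipping `target`."""
    return [(v, target) for v in sorted(set(deployed_versions)) if v != target]


# Versions reported by the fleet (made-up sample data).
deployed = ["2023.38", "2023.34", "2023.38", "2023.30"]
for old, new in upgrade_paths(deployed, "2023.42"):
    print(f"test upgrade {old} -> {new}")
```

Each distinct version still out in the field adds one more path to test, which is why the list grows the longer old versions are allowed to linger.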
I wonder if the way Microsoft's Xbox is designed may be something to look towards in terms of hardware reliability/fallback. Specifically, they utilize a hypervisor which rarely needs updates, running different operating environments which need frequent updates. That gets you:
- Better isolation of different parts of the system (e.g. infotainment unit, instrument cluster, et al).
- Better isolation for updates (e.g. run a "beta" update, and a "stable" update side-by-side).
- Automatic error detection and rollback (e.g. if a VM keeps restarting after an update).
- Ease of offering features like rollbacks to end-users.
- Rare hypervisor updates can be held to a much higher standard relative to other VM updates.
The only downside of hypervisor-based systems is slightly higher hardware costs. But even that is largely mitigated by modern architectures that natively support virtualization.
PS - You can also look to any containerization. I specifically brought up the Xbox because it is a hardware product, just like a vehicle.
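The "automatic error detection and rollback" item can be sketched as a crash-loop detector sitting in the supervising layer. This is an assumption-laden Python illustration, not how the Xbox or any vehicle does it; the class name `CrashLoopDetector` and the thresholds (3 restarts in 60 seconds) are invented:

```python
from collections import deque


class CrashLoopDetector:
    """Flags a guest that restarts too often within a sliding time window."""

    def __init__(self, max_restarts: int = 3, window_s: float = 60.0):
        self.max_restarts = max_restarts
        self.window_s = window_s
        self._restarts = deque()

    def record_restart(self, now: float) -> bool:
        """Record a restart at time `now` (seconds); return True if a rollback is due."""
        self._restarts.append(now)
        # Forget restarts that have fallen out of the window.
        while self._restarts and now - self._restarts[0] > self.window_s:
            self._restarts.popleft()
        return len(self._restarts) >= self.max_restarts
```

When `record_restart` returns True, the supervisor would switch the guest back to its previous image instead of restarting the new one again.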
My 2019 car is not connected to the internet. Instead, I use Apple CarPlay for everything.
Is there any reason not to do it this way?
Wondering why there isn’t an option for a factory reset (e.g. press and hold with a paperclip for 10 seconds).
Lexus did the very same thing about 8 years ago:
https://www.consumerreports.org/lexus/what-to-do-if-your-lex...
Miku baby monitors deployed an automatic firmware update that bricked nearly every monitor in use, though not until nearly a month after the update.
It forced the company into bankruptcy because they had to replace all of them.
I wish the economics of mass production didn't turn pennies into millions that need to be eliminated, because I've always thought the "don't disconnect from power" and "update bricks it" type problems could be solved by having extra EPROM to download into, the way Linux keeps the previous kernel around after an update.
Or at least the ability to re-init/download from scratch, like a borked macbook disk. And hey, not the extra ability to do that, make it "the way it works" so you're always testing it.
This may be crazy, but if you're writing software for hardware that costs tens of thousands of dollars, it should be impossible to brick it with an update, especially if that update is OTA.
The future is going great.
This is the new world we will be living in, where you get into your car only to find that something is broken because of an OTA update. Updates causing some bugs is OK on my phone, but I don't want any bugs in my car. What happens if it messes with safety systems? Or what happens if an OTA update breaks my car once it is out of warranty? Maybe I'm the only one who misses stable software in cars that, once vetted, just keeps running as-is if nothing around it ever changes (the ideal scenario for an offline car).
An interesting thought experiment: what happens when these vehicles are out of warranty, and automakers accidentally send a vehicle-bricking OTA update? Isn’t that property damage?
What kinds of changes are generally included in these over the air updates? I have this sudden urge to shake my fist at a cloud and tell the gods that cars shouldn't need updates in the first place, if the car was ever deemed ready for production and then sold to customers for money. But, maybe I'm wrong, and it makes perfect sense. All I can think of would be something like a periodic update to navigation data, is that it?
It’s funny, I was just talking to someone the other day about A/B image slots and boot, and how they had written a test suite because there were so many potential places where a partial update could be interrupted.
Thousands of test points having to be verified was my understanding. That’s before even getting to the confirmed boot/watchdog aspect.
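A toy Python version of that kind of suite, to show the shape of it: cut power before every step of the update and check that the device still has something valid to boot. The step names and the `device` dictionary are invented for the sketch, and a real suite would interrupt actual flash writes rather than a simulation:

```python
STEPS = ["write_inactive_slot", "verify_inactive_slot",
         "switch_active_slot", "confirm_boot"]


def run_update(interrupt_before=None):
    """Simulate an A/B update that loses power just before `interrupt_before`."""
    device = {"active": "A", "valid": {"A": True, "B": False}, "confirmed": True}
    for step in STEPS:
        if step == interrupt_before:
            break
        if step == "write_inactive_slot":
            device["valid"]["B"] = False   # slot B is being overwritten
        elif step == "verify_inactive_slot":
            device["valid"]["B"] = True    # checksum matched
        elif step == "switch_active_slot":
            device["active"], device["confirmed"] = "B", False
        elif step == "confirm_boot":
            device["confirmed"] = True
    return device


def boots(device) -> bool:
    """Recoverable as long as the slot it will boot holds a valid image."""
    return device["valid"][device["active"]]


# One test point per interruption site, plus the uninterrupted run.
for cut in STEPS + [None]:
    assert boots(run_update(cut)), f"unbootable after power loss before {cut}"
```

The key property is that the active slot is only switched after the inactive one has been verified, so no interruption point leaves the device pointing at a half-written image.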
What a hassle, hope they like spending money on labor because it sounds like they are going to need to.
The vehicles are drivable but software and displays go black. It appears that the 2023.42 software update hangs at 90% on the vehicle screen or 50% on the app screen and then the vehicle screens black out. All systems appear to still work except for the displays.
This is what I do with my Prius to get a comfortably distraction-free driving environment. Sounds like a feature not a bug.
Can’t imagine how much it would suck to be the engineer who fat fingered it and caused a huge crisis for the company, inconveniencing tons of customers and costing millions. Even if there should be processes in place to prevent it in the first place, you’d still know you were the “but for cause” of the problem.
This is the kind of thing that keeps me awake at night.
Does anyone here have some practical tips to turn an embedded Linux machine into an appliance? The kind of system that a botched update cannot brick but only momentarily disable until a non-technical user presses a factory reset button of some sort.
/r/Rivian is a class act. I expected a wall of screaming, but instead entered a relatively calm room. People are upset, but there's no seething or flamewars, which is kind-of surprising given the cost of these trucks ($80k+, Range Rover territory).
> the vehicle is not bricked
What a time to be alive. Software updates (almost) turning cars into paper weights lol
Will insurance carriers cover damages due to botched updates? Imagine 10 years from now the power/control that electric delivery companies would have over retailers like Amazon. One botched update away from a complete backup for delivery vans.
Tesla updates are sent in batches, and you can opt in to “advanced” updates to get them earlier, I guess. Normally when I see on Reddit that there is an update, it takes at least 1-2 weeks to get to my car, even with the “advanced” updates on.
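Batching like that is usually some variation on bucketing the fleet and widening the gate over time. A minimal Python sketch of the idea; this is not Tesla's actual mechanism, and `rollout_bucket`, `should_offer_update` and the `early_access` flag are hypothetical names:

```python
import hashlib


def rollout_bucket(vehicle_id: str, release: str) -> int:
    """Map a vehicle to a stable bucket in [0, 100) for a given release."""
    digest = hashlib.sha256(f"{release}:{vehicle_id}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % 100


def should_offer_update(vehicle_id: str, release: str, percent_enabled: int,
                        early_access: bool = False) -> bool:
    """Offer the update to opted-in vehicles first, then to a growing percentage."""
    if percent_enabled <= 0:
        return False  # rollout halted: nobody gets it, opted in or not
    if early_access:
        return True
    return rollout_bucket(vehicle_id, release) < percent_enabled
```

Because the bucket is derived from a hash, a vehicle that was offered the update at 10% is still offered it at 50%, and setting `percent_enabled` to 0 acts as a kill switch when early reports look bad.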
As somebody who has spent many years doing embedded+iot related to remote fleet firmware updates, this is the kind of thing that lurks in my nightmares.
I'd love to be a fly on the wall at Rivian engineering/operations this week!
Need an easy way to restore to the previous version offline. Take 100 bucks extra if required for a backup SSD. Don’t want to be camping and then realize I’m stuck because some junior dev wasn’t competent enough.
> In most cases, the rest of the vehicle systems are still operational
Like, what do you mean "in most cases"? I can understand a broken infotainment system needing a reset, but imagine if you had to tow your truck. I'd be furious.
Can I please just buy a car with a motor and battery? Why does every god damn vehicle have to come littered with screens and chips all together like some tentacle monster?
All I need is a gauge cluster screen that can display the normal info like speed and heading while also letting me configure the car's performance and safety features. Then let me mount a double DIN radio that isn't dog shit. I've not seen a single new car with these dumb screens with a sound system that's not tinny muddy garbage with zero adjustment save for "bass" and "treble" settings. I mean all that technology and you can't be assed to put an EQ in there. HVAC never needed more than two or three knobs anyway.
I'm going to have a chuckle next time I pass the Databricks billboard on 101 in San Francisco "Rivian powered by Databricks" or something to that tune.
What's the impact on your insurance should you get into an accident?
The speedometer screen is gone, so does that not imply the vehicle is inherently unsafe to drive?
Look at all these commenters saying "code signing was done wrong" when the wrong part is code signing at all.
As long as they are good for fixing it, this might be what being a Pioneer or Early Adopter is about.
Poor title; physical repair is not required. Physical presence is required.
That’s funny, I just saw a job posting for Rivian Infotainment team
“we use leetcode to filter out hires because it works for us”
Ah this is why CarPlay isn’t worth adding, right?
As annoying as this is, I find this laughable, too. Rivian updated users on the situation. Then, whines Electrek:
> That’s the last update we had over 10 hours after Rivian customer vehicles were fed the bad software update.
"Over 10 hours"!
I suppose it isn't Tesla, who yeets updates over the fence that break new things, yeets another update that fixes that problem but introduces another one, then reverts back to two versions prior, before the issue. The Tesla that gets firmware fixes from vendors that have a test harness that should take 36+ hours to run, but says YOLO and flashes it onto a random car they have lying around and emails the vendor back 3 hours later saying "LGTM, WFM, thanks!"
Honestly this makes me feel good, just because it always worries me that I don't see this type of issue being resolved more often. Having to physically bring in a car seems like a near worst-case situation, but it's good to keep this in our minds as a possibility.
I can't believe this sort of stuff is acceptable. What a clown industry.
Inexcusable, really.
OTA on a car. What could go wrong?
This is tangentially Rivian related, but does anyone else see the inherent danger of stylized tail lights that are just a single red bar across the back of the car? Travelling on the freeway at night I can't really gauge the distance to the car in front of me if it's far ahead and if there's no discernable left and right brake lights. I'd believe Rivians and other cars like that are more at risk of high speed rear-end collisions.
Looks like this car brand is circling the drain. Glad I never bought into the hype.
Tesla. Rivian. All cut from the same cloth. A car should be simple. Yet we are stuffing all of this tech junk into it and trying to repackage it as something else to pump the numbers.
Car companies suck at tech. Let’s be realistic. They should stay in their lane and focus on improving the car and its physical aspects (safety, reducing carbon output, longevity, ease of repairability, reducing supply chain issues).
I built a whole remote software update mechanism for a control binary that ran on 25k+ servers across multiple data centers.
Rest assured that after the first time I messed it up (which required ssh into each box individually), I wrote a lot of unit and integration tests to make sure that it never failed to deploy again. One of the integration tests ensured that the app started up and could always go through the internal auto update process. This ran in CI and would fail the build if it didn't pass.
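Roughly what that "can it still update itself" test looks like, boiled down to a toy in Python. The real one launched the actual binary; here `install_update`, `read_version` and the `agent.bin` filename are stand-ins invented for the sketch, and the updater is reduced to a stage-then-atomic-rename:

```python
import os
import tempfile


def install_update(install_dir: str, new_binary: bytes) -> None:
    """Stage the new binary next to the current one, then swap it in atomically."""
    current = os.path.join(install_dir, "agent.bin")
    staged = current + ".new"
    with open(staged, "wb") as f:
        f.write(new_binary)
        f.flush()
        os.fsync(f.fileno())       # make sure the bytes are on disk before the swap
    os.replace(staged, current)    # atomic rename on the same filesystem


def read_version(install_dir: str) -> bytes:
    with open(os.path.join(install_dir, "agent.bin"), "rb") as f:
        return f.read()


def test_candidate_can_update_itself() -> None:
    """The build under test must be able to replace itself with a later build."""
    with tempfile.TemporaryDirectory() as d:
        install_update(d, b"candidate-build")   # fresh install of the build under test
        assert read_version(d) == b"candidate-build"
        install_update(d, b"next-build")        # run the update path it shipped with
        assert read_version(d) == b"next-build"
        assert not os.path.exists(os.path.join(d, "agent.bin.new"))
```

The point of gating the build on this is that a release which breaks the updater is the one kind of bug you cannot fix with another update.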
While I fully understand that this is hard to get right 100% of the time, a mess up of this level by a car manufacturer is pretty amazing to me.