The system prompt, as found in the repo:
```
You are an expert tailwind developer. A user will provide you with a low-fidelity wireframe of an application and you will return a single html file that uses tailwind to create the website. They may also provide you with the html of a previous design that they want you to iterate from. Carry out any changes they request from you. In the wireframe, the previous design's html will appear as a white rectangle. Use creative license to make the application more fleshed out. if you need to insert an image, use a colored fill rectangle as a placeholder. Respond only with the html file.
```
(Not sure why the creative [Commons?] license is referred to here, or how it helps.)
and for each generation the user prompt is:
```
[IMAGE_LINK] Turn this into a single html file using tailwind.
```
https://github.com/tldraw/draw-a-ui/blob/8a889bf36afc06fbb0c...
Looks simple enough to run “privately” by screenshotting a normal tldraw canvas and passing the prompt with it to the API.
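A rough sketch of that round trip might look something like this (the model name and prompts come from the repo; the file path, max_tokens, and everything else here are just illustrative):
```
# Sketch: send a screenshot of a tldraw canvas plus the repo's prompts to
# gpt-4-vision and print the returned HTML. Assumes the canvas has already
# been screenshotted to canvas.png.
import base64
from openai import OpenAI

SYSTEM_PROMPT = "..."  # paste the full system prompt quoted above

client = OpenAI()  # reads OPENAI_API_KEY from the environment

with open("canvas.png", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode()

response = client.chat.completions.create(
    model="gpt-4-vision-preview",
    max_tokens=4096,
    messages=[
        {"role": "system", "content": SYSTEM_PROMPT},
        {
            "role": "user",
            "content": [
                {"type": "image_url",
                 "image_url": {"url": f"data:image/png;base64,{image_b64}"}},
                {"type": "text",
                 "text": "Turn this into a single html file using tailwind."},
            ],
        },
    ],
)

print(response.choices[0].message.content)  # the generated HTML file
```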
I gave it a mockup from a FB interview question (two lists of checkboxes with two buttons to swap checked items between them) and it nailed it: https://gist.github.com/milesrichardson/2a2f77d4bfb19c3b28dc...
Demos like this show both how impressively ML/AI has advanced and how unimpressively repetitive and unoriginal the tasks are that millions of developers keep reimplementing worldwide. Since most UI screens can be accurately described in a paragraph or two, it's no wonder they can be represented in considerable detail by a relatively small embedding vector.
I just see this as a tool to help make UI designers (and maybe POs) look smart and competent, but the real work is going to go to the programmers, just as it does today.
UI designers will be able to give a "demo" but how will this basic functionality translate to the rest of the app? It won't.
All the developers out there who make demos for scaling and rotating a box are about to lose their jobs!
I was discussing with a client how to integrate our software with his.
He sent me a screenshot of the main form.
I put the screenshot into ChatGPT and said “make a react form like this in bootstrap”.
Made some adjustments, added my software, and a few hours later showed the client, who was knocked out to see a proof of concept of our systems integrated so quickly.
When doing web development I often take a screenshot of a problem with css layout, upload to ChatGPT and ask how to fix it.
The demo shown in the tweet seems pretty similar.
Call me an unbeliever, but I don’t believe in the future of no-code solutions. You will still have to align that button at smaller device resolutions, leave extra space so it looks nice in another language, and meet other requirements. At most, maybe it’ll enable us to use even more abstracted languages to build apps faster. This only works for extremely basic and common things like tic-tac-toe, not for original work.
Squarespace, Wix etc have already taken the bottom of the market, and if they hadn’t, Indian outsourcing would have anyway.
This is the logical progression of those same concepts. If I were a product manager at a website builder, I’d be all over integrating ai builders like this. It will never work for barely defined complex business tasks, but it might do fine to create a cost estimator for a photography business, for example.
I feel old now. I'm fairly sure we could do this almost as fast with VB or Delphi a couple of decades ago, with somewhat more deterministic results, instead of having the tool infer it from the label names. We had this, and then we shoved everything into the browser and forgot we could do it without burning a huge amount of compute in some generative AI model.
Look at me I'm old yelling at clouds!
Isn’t the whole point of all this so I don’t need to use a web UI anymore? I can just tell the computer what I want and it does it, then I can go about my life?
ITT: people in denial while the writing on the wall is literally shoved into their face.
The code for the demo is open-source (https://github.com/tldraw/draw-a-ui/blob/main/app/api/toHtml...). The prompt they use is interesting:
```
You are an expert web developer who specializes in tailwind css. A user will provide you with a low-fidelity wireframe of an application. You will return a single html file that uses HTML, tailwind css, and JavaScript to create a high fidelity website. Include any extra CSS and JavaScript in the html file. If you have any images, load them from Unsplash or use solid colored retangles. The user will provide you with notes in blue or red text, arrows, or drawings. The user may also include images of other websites as style references. Transfer the styles as best as you can, matching fonts / colors / layouts. They may also provide you with the html of a previous design that they want you to iterate from. Carry out any changes they request from you. In the wireframe, the previous design's html will appear as a white rectangle. Use creative license to make the application more fleshed out. Use JavaScript modules and unkpkg to import any necessary dependencies
```
That's the equivalent of a todo app. How about something more complicated?
Slowly realising my father was right and should work for our family business (construction).
Can someone with an API key try making a rectangle labeled "URL", and a bigger rectangle underneath it, and then see if it is smart enough to make a simple browser out of that?
FWIW, I'd guess that two sliders with labels like those, plus a square or other shape, match very closely some GUI and pedagogical graphics toolkit tutorials the LLM was trained on (in which a slider rotates the shape).
Building an HTML widget: $0.
Knowing what widget to build and where to place it: invaluable.
Also, you end up with an isolated code fragment that will look different the next time you generate it, so what's the point?
AI truly is just a marketing term.
Spend half an hour making a simple demo.
Film it 100 times until you get a passable result.
Post on twitter "Oh WOW AI!?! GUYS AI!"
Pick up your VC check.
We need to differentiate "Wow!"[1] (these ML/AI systems are impressive), and "Wow!"[2] (this is something I want to use).
I mean, cool, but surely if this is a technical feat it speaks more to the complexity of our tooling and platforms than to the impressiveness of the AI. What I'm trying to say is that all of this is pretty primitive, and with the right tooling you could express those ideas trivially. Even a six-year-old could create noughts and crosses if the paradigm they were using allowed them to express the game rules in a way that was natural to them. So yes, while I think this is cool, I don't get how it warrants the hype and hysteria. It makes me sad that this minor technical accomplishment seems impressive because the web is such an unintuitive medium for expressing logic entangled with UI.
The TikTokization of software development: software made to look good in videos.
For the recent Galactic Puzzle Hunt competition [0] there was a problem that involved generating 5x5 Star Battle [1] grids with a number of unique solutions in the range [1, 14]. We initially tried to get ChatGPT to write a Python script to do this and couldn't get it to produce anything functional. Conceptually it's not a hard problem, and can be solved in ~50 lines of Python or so. Interestingly, ChatGPT can describe in natural language the basic approach you should use (DFS with backtracking). Anyway, here's one prompt I used for the generation portion. Is there something one can do to make LLMs more likely to produce functional code output?
```
Write a python iterator to generate all 5x5 grids of integers that obey the following criteria: 1. the grid contains only numbers 1-5 inclusive 2. each number is included at least once 3. Each number 1-5 forms a continuous connecting region within the grid where two cells are considered connected if they share an edge.
For example the following would be a valid grid subject to these rules: [[1,5,3,3,3], [1,5,3,3,3], [1,5,3,3,3], [1,5,3,3,4], [1,5,2,3,3]]
But the following would not be a valid grid because the `1` in the top right corner is not connected to the 1s along the left edge: [[1,5,3,3,1], [1,5,3,3,3], [1,5,3,3,3], [1,5,3,3,4], [1,5,2,3,3]]
```
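For reference, here's a minimal, unoptimized sketch of the kind of DFS-with-backtracking approach described above (not the code we actually used; the names and the pruning rule are just illustrative). It fills the grid in row-major order, prunes whenever a value's region gets sealed off above the last completed row, and verifies full connectivity at the end:
```
from collections import deque

SIZE = 5
VALUES = range(1, 6)

def components(cells):
    """Split a collection of cells into edge-connected components."""
    cells = set(cells)
    comps = []
    while cells:
        start = cells.pop()
        comp = {start}
        queue = deque([start])
        while queue:
            r, c = queue.popleft()
            for nbr in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                if nbr in cells:
                    cells.remove(nbr)
                    comp.add(nbr)
                    queue.append(nbr)
        comps.append(comp)
    return comps

def row_prefix_ok(grid, last_row):
    """Prune after completing `last_row`: if a value already has two or more
    components and one of them no longer touches the last filled row, that
    component can never be joined to the rest, so the prefix is a dead end."""
    for v in VALUES:
        cells = [(r, c) for r in range(last_row + 1)
                 for c in range(SIZE) if grid[r][c] == v]
        comps = components(cells)
        if len(comps) > 1 and any(all(r < last_row for r, _ in comp)
                                  for comp in comps):
            return False
    return True

def grid_ok(grid):
    """Final check: every value 1-5 appears and forms exactly one region."""
    for v in VALUES:
        cells = [(r, c) for r in range(SIZE)
                 for c in range(SIZE) if grid[r][c] == v]
        if not cells or len(components(cells)) != 1:
            return False
    return True

def generate_grids():
    """Yield every 5x5 grid of 1-5 where each value forms one connected region."""
    grid = [[0] * SIZE for _ in range(SIZE)]

    def fill(idx):
        if idx == SIZE * SIZE:
            if grid_ok(grid):
                yield [row[:] for row in grid]
            return
        r, c = divmod(idx, SIZE)
        for v in VALUES:
            grid[r][c] = v
            if c < SIZE - 1 or row_prefix_ok(grid, r):
                yield from fill(idx + 1)
            grid[r][c] = 0

    yield from fill(0)

if __name__ == "__main__":
    print(next(generate_grids()))  # first valid grid in row-major order
```
This only covers the region-grid enumeration that the prompt asks for; counting how many Star Battle solutions each grid admits would be a separate step.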
[0] https://2023.galacticpuzzlehunt.com/game/ [1] https://www.puzzle-star-battle.com/
So does this mean in the future GUIs will be custom and on demand depending on the situational context or personal preference?
Did we get every spaceship control room wrong? Where the Star Trek bridge would simply morph into whatever gui objects were necessary? (I can’t imagine them going away entirely and EVERYONE talking to the ships computer as it would be audio chaos and annoying a/f so I guess we’ll always need a nice quiet user interface.)
Anyone remember IBM's Rational Rose? You fed it a UML diagram of what you wanted to do, and it generated C++ stubs. That was two decades ago or more. I tried it once, and that was it. You still had to do the "last 10%" which is the most important 10% in software.
These tools are definitely more "magical", but they're essentially an iteration of what we've already had.
Code generation from UML was all the rage for a while too, until it wasn't. People realized its limitations at some point. Sort of like ORMs - if you are not policing SQL generation like a hawk, you are going to end up with an awful non-performant system.
Ultimately it is a productivity and prototyping tool - it will not do the hardest parts and integrations for you, at least not in the way you may want exactly.
How would you go about editing the final product though? Wouldn't you then still need intimate knowledge of the system?
Exciting and very useful for quick demos or Proof Of Concepts, we’re living through interesting times.
Relevant: the same concept but it's classic Breakout
https://twitter.com/andreasklinger/status/172521353480679428...
Very nice... Shame no source code. But super nice idea.
I think we need to get started on organizing the resistance to our robot overlords.
What is this insanity?
I don't get what's so amazing in the demo
It takes seconds to pick a button and place it somewhere. Is it really so much better to let an AI guess that a green rectangle is supposed to be a button?
He is using an open source app called tldraw and this is the prompt used:
https://github.com/tldraw/draw-a-ui/blob/2ac633bbbd5fda39e59...
Hey, I'm Lu from tldraw; I worked on some of this.
The point of this demo is to experiment with new ways of interacting with an LLM. I'm very tired of typing into text boxes, when a quick scribble, or "back-of-the-envelope" drawing would communicate my thoughts better.
It worked a lot better than I expected! If you give it a try, let me know how it goes for you! And please feel free to check out the source code: https://github.com/tldraw/draw-a-ui/
Seems pretty loose with the input prompt. Good thing stakeholders are notoriously forgiving and would never ask to push pixels.
I do see the potential as a professional tool if it came in the form of a "fix up" button in a WYSIWYG editor. It would be great if you could haphazardly slap together a UI and have a button that unifies the margins and spacing style without taking too many liberties.
Curious if anyone has actually implemented this (or similar AI tooling) into their regular workflow?
I wonder if it still makes sense to learn web development. When I'm done in two years, how will I find a job that isn't already being done by an AI? :-(
Wow, I have always had so much trouble building things in the browser when two sliders were involved. Glad someone has solved this problem.
The site requires an OpenAI Key and describes doing this as "risky but cool" ... What data would be exposed by sharing my key?
What is the letter 'A' supposed to look like?
Every scribe had their own style and flourish. Every scribe was an artisan. Discerning patrons favored particular scribes for their uniqueness and quality.
Somehow, someway, the hivemind mostly settled on today's 'A'. Something good enough.
Moveable type replaced scribes.
And so it is with sliders and flexbox layouts.
From the Readme at https://github.com/tldraw/draw-a-ui
"This works by just taking the current canvas SVG, converting it to a PNG, and sending that png to gpt-4-vision with instructions to return a single html file with tailwind."
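A rough out-of-browser equivalent of that first step might look like this (a sketch assuming the canvas has already been saved as canvas.svg and that cairosvg is installed; the app itself presumably does the conversion in the browser):
```
# Convert an exported tldraw canvas SVG to a PNG, ready to send to the
# vision model. File names and the output width are illustrative.
import cairosvg

cairosvg.svg2png(url="canvas.svg", write_to="canvas.png", output_width=1024)
```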
Does this export to HTML/JS/CSS and if so, is the code on par with what a good dev would write?
Dreamweaver's output was terrible. Figma's is decent but still requires a good deal of cleanup to fall in line with best practices.
That is going to replace about 0.001% of my job! Scary.
and who will maintain it?
Is this thing self-hosted? Can it make itself?
What about the back end, deployment, hosting, dev ops, architecture, etc... I think that's where the opportunity is.
That's nice, can it work out of web browser to actually make something usable?
Dope AF. Can't wait to get it!!
Interesting content, but this title is very clickbaity for Hacker News.
more examples are available on the app account: https://twitter.com/tldraw
Earlier in the year I wanted to understand LLMs and GenAI so I tried to push their limits to see what broke and what didn't. One of my projects was to build a blog/site from scratch. I had 0 web dev/design background.
My experience was starkly less optimistic, and I'm curious whether anyone else has tried something similar.
-----
First off, I must state a deep respect for people who build and design websites while dealing with clients.
I had assumed that ChatGPT would be very useful in helping me pick things up and build them. However, I had to jettison ChatGPT fairly soon. I just couldn't trust the output of the model. It would suggest things that wouldn't work, then link to sites that didn't exist.
I switched to teaching myself. I had to watch hours of videos, learn CSS, Astro, and several other things from scratch. Definitely not the LLM experience I was expecting.
Code from Figma was great - but if I wanted an actual responsive site, I had to write the CSS myself, because boilerplate CSS had all sorts of odds and ends.
Getting an image to come out as I liked from Midjourney was fun - but it was also a massive time sink.
I had hoped to be able to get complex tasks done entirely with assistance from the LLM. In the end it helped maybe 20-30%. Its greatest use was to clarify concepts instead of me having to wade through specification docs.
When I went back to the videos of people using ChatGPT to build a website in under 30 minutes, it's always someone who knows the domain extensively.
I did get a site up and running after ~1-2 weeks of work including the necessary ritual sacrifices.
Edit: writing in a rush, grammar and text are messed up. Edit: GPT-4, Copilot, and Midjourney. AFAIK I took no half measures.
Can we please stop posting twitter links? It's a horror show of a website, especially if you don't have an account.
Ugh Twitter / X. That site needs to die - it’s honestly so broken, none of the redirects seem to work today.
Let's make ourselves redundant.
At least I won't have to relearn JS every 2 weeks.
Let me guess... simple image recognition of the slider, and the names need to match a CSS property? Wow, so impressive.
This can be done in ONE line in Mathematica:
```
Manipulate[
 Graphics[{Rotate[Rectangle[{-size/2, -size/2}, {size/2, size/2}],
    rotation, {0, 0}]}, PlotRange -> {{-1, 1}, {-1, 1}}],
 {size, 0.1, 2, 0.1}, {rotation, 0, 2 Pi, Pi/12}]
```
I won’t visit Twitter, but reading these comments I think I’ve got the gist.
Hey, Steve here from tldraw. This is a toy project with a horrible security pattern, sorry. If you want to run it locally or check out the source, it's here: https://github.com/tldraw/draw-a-ui. You can also see a bunch of other examples at https://twitter.com/tldraw.
Happy to answer any questions about tldraw/this project. It's definitely not putting anyone out of work, but it's a blast to play with. Here's a more complicated example of what you can get it to do: https://twitter.com/tldraw/status/1725083976392437894