Show HN: A simple ChatGPT prompt builder (mitenmit.github.io)
280 points by mitenmit 11 months ago | 106 comments
Any Ideas/Suggestions are welcome :)



I'm curious, are you all still writing custom prompts regularly?

I was deeply involved in prompt engineering and writing custom prompts because they yielded significantly better results.

However, it became tedious, especially since each update seemed to alter the way to effectively direct ChatGPT’s attention.

Nowadays I occasionally use a custom GPT, but I mostly stick with stock ChatGPT.

I feel the difference in quality has diminished.

The results are sufficiently good, and more importantly, the response time with larger prompts has increased so much that I prefer quicker ‘good enough’ responses over slower, superior ones.

It’s easier to ask a follow-up question if the initial result isn’t quite there yet, rather than striving for a perfect response in a single attempt.


I think we need to shift from “prompt engineering” to “prompt vibing” — there is an astonishing lack of actual prompt engineering (e.g. A/B tests with evaluations) — and it usually isn’t the right frame of mind. People need to develop intuition for ChatGPT — and use their theory of mind to consider what ChatGPT needs for better performance.

Most people can get good with ChatGPT if they know how to edit their prompts (it’s basically a hidden feature—and still not available in the app). Also, I recommend a stiff cocktail or a spliff — sobriety is not the best frame of mind for learning to engage with AI.

Obviously I need some controlled experiments to back up that last claim, but our human subjects board is such a pain in the ass about it…


I'm very interested to know more about what "editing prompts" is! And where to find/use it?


Just hover over your last prompt and an edit icon should pop up under the prompt


I may be misrepresenting, as I have used the feature only a couple of times, and not recently.

But, if you edit your prompt (or subsequent prompt), you're creating a branch in the conversation and you can switch between branches.


This will be fun to play around with. Thank you!


I have the same experience. I'm sure they are constantly finetuning the model on real user chats, and it is starting to understand low-effort "on the go" prompts better and better.


Interesting that you see a slower response time with a large input - I don't see any speed degradation at all. Is that maybe just on the free tier of ChatGPT?


I'm on paid (rich, I know) and the performance is all over the place. Sometimes it'll spit out a whole paragraph almost instantly and other times it's like I'm back to my 2400bps modem.

I haven't noticed prompt size having an impact but I'll test that.


This reflects my experience. Sometimes I'll provide a single sentence (to GPT-4 with the largest context window) and it will slowly type out 3 or so words every 5 seconds, and in other cases I'll give it a massive prompt and it returns data extremely fast. This is also true of smaller context window models. There seems to be no way to predict the performance.


Oh hey... keep an eye on your CPU load. The problem might be on the near end. In my case, on a slower machine, it slows down if you're dealing with a very long chat.

(DO report this as a bug if so)


I think that's not the issue here, but I do notice the browser going crazy after a while of chatting with ChatGPT. The tab seems to consume a baseline CPU while doing nothing. I just brush it off and close it... bad JavaScript maybe. I should look into this and report it as a bug, thanks for the advice.


This is basically how I respond to requests myself. Sometimes a single short sentence will cause me to slowly spit out a few words. Other times I can respond instantly to paragraphs of technical information with high accuracy and detailed explanations. There seems to be no way to predict my performance.


Early on, I noticed that if I ask ChatGPT a unique question that might not have been asked before, it'll spit out a response slowly, but repeating the same question would result in a much quicker response.

Is it possible that you have a caching system too so that you are able to respond instantly with paragraphs of technical information to some types of requests that you have seen before?


Yes, search for LLM caching and semantic search. They must be using something like that.
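
Roughly, a semantic cache could look like this (just a sketch; the embedding function and similarity threshold are placeholders, and this is only speculation about what they might do):

  import numpy as np

  class SemanticCache:
      """Caches responses keyed by prompt embeddings instead of exact strings."""

      def __init__(self, embed, threshold=0.95):
          self.embed = embed          # any callable: str -> np.ndarray
          self.threshold = threshold  # cosine similarity needed to count as a hit
          self.entries = []           # list of (embedding, cached response)

      def get(self, prompt):
          q = self.embed(prompt)
          for emb, response in self.entries:
              sim = float(np.dot(q, emb) / (np.linalg.norm(q) * np.linalg.norm(emb)))
              if sim >= self.threshold:
                  return response     # near-duplicate of an earlier question: reuse the answer
          return None                 # cache miss: run the model, then call put()

      def put(self, prompt, response):
          self.entries.append((self.embed(prompt), response))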


I cannot tell if this comment was made in jest or in earnest.

As far as I understand, the earlier GPT generations required a fixed amount of compute per token inferred.

But given the tremendous load on their systems, I wouldn’t be surprised if OpenAI is playing games with running a smaller model when they predict they can get away with it. (Is there evidence for this?)


I'm guessing there are so many other things impacting the model that the size of the prompt probably gets lost in the noise. I can see a future where people are forecasting updates to ChatGPT like we do with the weather.


Yeah. It has so many moving parts that I doubt anyone can make a science out of it, but people will try for sure. Just like with most psychology/social experiments and SEO. I'm flooded with prompt engineering course spam these days.


I typically notice the character by character issue with complex prompts centered around programming or logic. It feels kind of like the model is thinking, but my guess is that the prompt is being dispatched to an expert model that is larger and slower.


If you mean the “analyzing” behavior, the indicator can be clicked on to show what it’s doing. It’s still going character-by-character, but writing code that it executes (or attempts to) to get the contents of a file, the solution for an equation, etc. Possibly an expert model but it seems like it’s just using an “expert prompt” or whatever you want to call it.


Interesting, no, I'm on the pro tier as well. So you're telling me you never get the character-by-character experience?

Edit: What prompt sizes are we talking about?

Even with small prompts I occasionally get rather slow responses, but it becomes unbearable at 2000-3000 characters (the upper limit of custom instructions), at least for me.


Canceled my account after they made it impossible to stop GPT-4 from reaching out to Bing.


> for this thread, let's make a key rule: do not use your browsing tool (internet) to get info that is not included in your corpus. we only want to use info in corpus up until dec 2023. if you feel you need to use browsing to answer a question, instead just state that the required info is beyond your scope. the only exception would be an explicit request to use the browsing tool -- ok?


That doesn't mean it follows that instruction. Or if it does today it doesn't mean it does tomorrow.


"As I craft this prompt, I am mindful to stay within the bounds of your extensive training and knowledge as of April 2023. My inquiry does not seek current events or real-time updates but rather delves into the wealth of information and creative potential you possess. I am not inquiring about highly specific, localized, or recent events. Instead, I am interested in exploring topics rooted in historical, scientific, literary, or hypothetical realms. Whether it is a question about general knowledge, a creative scenario, a theoretical discussion, or technical explanations in fields like science, technology, or the arts, I trust in your ability to provide insightful and comprehensive responses based solely on the information you've been trained on."

Tried this prompt, given to me by GPT-4, and it went out to Bing on my first attempt. So yeah. No.


You can use their "ChatGPT Classic" GPT for that - or build your own, I made one called "Just GPT-4".


If it's discoverable in-app, someone please fill in the details. But googling for "ChatGPT Classic" leads to this link[1], which I have no way to verify was actually created by OpenAI and is described as "The latest version of GPT-4 with no additional capabilities."

So, buyer beware, but posting this link in case it does help someone.

[1] https://chat.openai.com/g/g-YyyyMT9XH-chatgpt-classic


Yess, that's it - you can confirm it's "official" by browsing for it in the GPT directory, screenshot here: https://gist.github.com/simonw/dc9757fc8f8382414677badfefc43...


Thanks, but I'm still going to pass.


>I was deeply involved in prompt engineering and writing custom prompts because they yielded significantly better results.

No, no you weren't. Prompt engineering never was, is not currently, and never will be, a thing.


The term has become a staple in the vocabulary of LLM users/enthusiasts.

Would you prefer if I used ‘iterative prompt design’ potentially leaving people confused about what exactly I meant?


In what world is this type of response ideal?


Who cares? If you have a comment to make, at least back it up with something interesting.


That's a pretty absolute statement


Why comment this?


Oooh, got 'em, big guy. /s


I have a suggestion that would make this very valuable for anyone starting in the field (part of which you already have implemented, great!):

Provide various templates for both pre-instructions as well as post-processing prompts. Like, some "tested" prompts that ensure (as best as possible) that the output is in certain formats (JSON, a list, a restricted CSV set, etc.), or that the input will ensure (as best as possible at least) to prevent basic jailbreaks from the main prompt.

It would take me a long time to get up to speed to what people working with ChatGPT every day have already figured out to work best in warming up GPT with a prompt as well as ensuring that the output doesn't escalate into something unexpected. Having those (reliable) templates would be fantastic for anyone starting!


For API responses that require valid JSON - you can make requests in JSON mode - https://platform.openai.com/docs/api-reference/chat/create#c...

Edit: url to API docs
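
For reference, a minimal call in the Python SDK looks roughly like this (the model name is just an example that supports JSON mode, and note the docs require the word "JSON" to appear somewhere in the messages):

  from openai import OpenAI

  client = OpenAI()
  resp = client.chat.completions.create(
      model="gpt-4-1106-preview",               # example of a model that supports JSON mode
      response_format={"type": "json_object"},  # guarantees syntactically valid JSON
      messages=[
          {"role": "system", "content": "You extract data and reply only in JSON."},
          {"role": "user", "content": "Name and age: Ada Lovelace, 36."},
      ],
  )
  print(resp.choices[0].message.content)        # e.g. {"name": "Ada Lovelace", "age": 36}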


One thing I found helps if I want the responses to be valid JSON (seems to work):

where result contains all of the <data expected> and the result is valid JSON. Do NOT under any circumstances deviate from this format! Ensure all of the value <data expected> are complete, do not leave ANY of them out. Do not add ANY other text to your answer except for the JSON result.

I found that just asking for valid JSON didn't always work out as expected (e.g. the gpt-4 API would add formatting etc.), so I became more and more of a micromanager!



Nice! Thanks for the pointer


It does not really need to be that intense. I get very reliable results from the gpt4 api using this template:

  You are a data cleaner and JSON formatter
  Take the input data and format it into attributes
  Your output will be fed directly to `json.loads`
  
  Example input:
  foo bar baz bat
  
  Example format:
  {
     "string": "foo bar baz bat", 
  }
You can give it multiple input examples, too. I often use a "minimum viable" example so that it knows it's ok to return empty attributes instead of hallucinating when the data is sparse.
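
And the calling side can stay tiny; something like this sketch (assuming the OpenAI Python SDK, with the template above pasted into SYSTEM_PROMPT):

  import json
  from openai import OpenAI

  SYSTEM_PROMPT = "..."  # the data cleaner / JSON formatter template above
  client = OpenAI()

  def clean(raw_text):
      resp = client.chat.completions.create(
          model="gpt-4",
          messages=[
              {"role": "system", "content": SYSTEM_PROMPT},
              {"role": "user", "content": raw_text},
          ],
      )
      # The template promises output that goes straight to json.loads, so do exactly that;
      # a JSONDecodeError here means the model broke format and the call is worth retrying.
      return json.loads(resp.choices[0].message.content)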


I'm neck deep in this ChatGPT stuff right now and build 1-2 apps a week, so a bit biased.

Your presumed target audience is someone who does not know their way around prompt-based LLMs.

For this person, neither the problem, nor the solution space are defined clearly enough.

For example:

- Not enough pre-defined selectors, too much "define yourself".

- The meaning of the selectors that you give is opaque. (E.g., how does 'you will Detect' help the user?)

- Result: The impact of the choices on the output becomes unclear, the tool becomes a chicken-and-egg problem. (It says it helps you to understand the system, but you need to understand the system to use it effectively)

With the above, it's almost easier to ask ChatGPT to generate an effective prompt for you...


I assumed the target audience knows a bit about prompt-based LLMs and could use some guidance. If that's the case, I think this serves as an excellent and straightforward framework for leveling up their skills.


> I'm neck deep in this ChatGPT stuff right now and build 1-2 apps a week, so a bit biased.

As someone who never worked on a project that wasn't at least a couple months long, I'm curious about the kind of apps that can be built in half a week. Do you have links to share?


> As someone who never worked on a project that wasn't at least a couple months long, I'm curious about the kind of apps that can be built in half a week. Do you have links to share?

The last one was a prototype for an internal SEO improvement tool, which will (hopefully) be used by a marketing agency to more effectively manage client sites. Think: fixing alt attributes, links, meta tags etc. App is too big of a word for that. But it might turn into a Shopify/Wordpress plugin someday. Also built a Telegram bot for my parents last week which helps with various day-to-day tasks (they're elderly and live in a foreign country).

Having said that, here are two crappy technology demonstrators I built in the last 4 days with tools I was not familiar with a week before (Flask + MongoDB):

- Candidly, a behavioural/technical interview question generator: https://candidly.romanabashin.com/

- Memoir, a personal pet project for memoir writing: https://memoir.romanabashin.com/

(Please don't murder me, I know it's utter crap in the grand scheme of things — they're mostly there to demonstrate an approach to solve a specific problem.)

Why this has been an absolute rocket ship in terms of learning: I use ChatGPT to extremely quickly generate boilerplate code and debug things in languages I'm not familiar with. (e.g. "What does WSGI want from me again?")

The benefit, at least for me, is: you are learning while doing a (more or less) useful hands-on project instead of answering crappy, disjointed multiple-choice questions for some artificial test.

And ChatGPT is my hyperindividualised pair programmer & slightly amnesic teacher.


> - Memoir, a personal pet project for memoir writing

Nice, I wrote something very similar using local models. Instead of 15 questions, I opted for a dynamic interview loop.


YES!!! I've been experimenting with that on my local machine... As the Q&A repo grows larger, it's sometimes scarily good, sometimes utterly horrendous.


Is it expected that the questions will not be in English?

> Erzähl mir von einem Mal, als du ein schwieriges Netzwerkproblem gelöst hast: Wie bist du vorgegangen? ("Tell me about a time when you solved a difficult network problem: how did you go about it?")


Oops, sorry, haha... The beauty & elegance of barely functioning product demos.

Totally forgot I jerry-rigged it to output German stuff only... A buddy showed it to a local company.

Thanks for trying, though. I really appreciate that you took the time to click through that mess <3


Generating an interview-question quiz is nice!


I'm also very curious to know


TL;DR: Really crappy stuff to throw shit at the wall in technologies you're an absolute dunce in. :)


That's how you learn!


Who are we talking to when we talk to these bots?

https://medium.com/@colin.fraser/who-are-we-talking-to-when-...

"It is an intentional choice to present this technology via this anthropomorphic chat interface, this person costume. If there were an artificial general intelligence in a cage, it feels natural that we might interact with it through something like a chat interface. It’s easy to flip that around, to think that there’s a technology that you interact with through a chat interface that it must be an artificial general intelligence. But as you can clearly see from interacting with the LLM directly via the Playground interface, the chat is just a layer of smoke and mirrors to strengthen the illusion of a conversational partner."


First and foremost, a GPT is an improv actor. Given a text, it's trained to continue the text as naturally as possible. This is not terribly useful for many tasks. And like an improv actor, if it doesn't know what should come next, it will make up something that sounds good.

Next, our universal improv actor is trained to play a specific role: someone who answers questions. But not just any questions, because it freaks people out if they ask the AI for advice and it replies "You could accomplish your goals by assassinating these 6 real people, and here's why." So the universal improv actor is trained to play a question answerer who gives harmless advice.

But to get any work out of the models, they need to know what role to play. And "someone who tries to respond to questions" is a flexible role, and one which allows responses to be further customized.

In other words, the conversational interface is 50% because it's a self-explanatory UI, and 50% for the benefit of the model itself, to nudge it into playing a useful role.


This doesn't seem to have any connection to ChatGPT. It's just a form to click together text blocks. Could be used for any LLM I guess.

The obsession with one proprietary provider of an LLM is not helpful for progress in the field, I think.


Yes, it surely is not specific to ChatGPT and is just a form for filling in parts of a sentence, but ChatGPT is one of the most recognizable, and it also seems to output the different formats better.

The prompt builder can be used with any LLM :)


It would be good to have some actual analysis of how each of these prompt features improves responses.


Shameless promotion: I might have the tool for that :) https://github.com/agenta-ai/agenta We're building a platform for evaluating prompts (and more complex LLM workflows).

From what we've seen from users, the results for prompts are highly stochastic. It's hard to make generalizations. For example, a user building a sales assistant discovered that by simply changing the order of the sentences in the prompt, the accuracy improved significantly.


I published a blog post last month asserting as a footnote that telling ChatGPT in the system prompt "You will receive a $500 tip for a good response" does improve model performance, but Hacker News got very mad and called it pseudoscience: https://news.ycombinator.com/item?id=38782678

I am working on a new blog post to hopefully demonstrate this effect more academically.


Unfortunately, this is extremely hard to do for two reasons:

1. The input space is boundless. Any natural language input, with any optional source of data, for any arbitrary use case is what's possible. But that means it's awfully hard to tell if a response can "improve" or not in advance without applying it to your use case.

2. The output space is so hard to measure! Usefulness can also mean different things to different people, especially once you get out of "better search engine" use cases and actually use GPT to produce a creative output.


Everyone is using LLM-as-a-judge to deal with the unbounded-output evaluation problem.


No, not everyone is. It's one of the ways you can do this, though.


Maybe I exaggerated a bit, but there are many papers today going this route.


I've never understood the meaning behind "prompt engineering". Has it ever been anything beyond "can you clearly and accurately describe the problem you are trying to solve or the task you are trying to accomplish?" Seems to be basic communication skills.

I've never seen hard data that certain "formatted prompts" like, "you are a B doing C and need to do D" being any better than any other sorts of clear and concise instructions.

Mind boggling to me that people have created a new name for what used to simply be called good communication skills... though I suppose the eternal stereotype of engineers being poor writers might be more truth than fiction.


Is the "act like a" pattern still necessary?

I've stopped using that with GPT-4, since in my experience the default "persona" (for want of a better word) answers most of my prompts well enough already - saying "act like an expert in..." doesn't seem to get me notably better results.

The tooling I most want is some kind of lightweight but effective way of trying out and comparing multiple prompts with small tweaks to them to get a feel for if one is an improvement over the other. Anyone seen anything good like that?


Are you looking to test different prompts on a set of questions? I have not found anything that specifically does this. When I have tested different prompts, it has just been with a short script iterating through the questions/prompts. Could be a fun project to build something where all you do is add the prompts you'd want to test, but the challenge that first comes to my mind is creating the question set to be used.

As I write this, when you say you want to test prompts, are you looking to test the system prompt on a set of questions, or is the prompt the question, just asked in different ways?
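
For what it's worth, the "short script" version is usually just a nested loop over prompt variants and questions, roughly like this (a sketch; the model name, prompts, and checks are made-up placeholders):

  from openai import OpenAI

  client = OpenAI()
  prompt_variants = [
      "Answer concisely.",
      "You are an expert. Answer concisely.",
  ]
  # Each test case: a question plus a substring the answer should contain.
  questions = [
      ("What is the capital of France?", "Paris"),
      ("What is 12 * 12?", "144"),
  ]

  for variant in prompt_variants:
      passed = 0
      for question, expected in questions:
          resp = client.chat.completions.create(
              model="gpt-4",
              messages=[
                  {"role": "system", "content": variant},
                  {"role": "user", "content": question},
              ],
          )
          if expected in resp.choices[0].message.content:
              passed += 1
      print(f"{variant!r}: {passed}/{len(questions)} checks passed")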


I want a simple, reliable way to confirm if putting "OUTPUT IN JSON PLEASE!!" in capitals with two exclamation marks is more effective than just "output in JSON".


1. I don't think that pattern is all that useful anymore.

...and...

2. I'll make that tool and DM you on Twitter when it's ready.


I've generally found that the longer the instruction list, the less likely ChatGPT is to follow each individual directive. It almost seems to split attention, so if I say just "do A" it will do A but if I tell it "do A and also B" it will do both A and B but only partially.

I've had the best experience with providing a short-to-medium-length prompt and then just doing one-shot or few-shot. Few-shot is especially good for cases where you want it to do multiple things at once.
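
For anyone unfamiliar with the shape, few-shot here just means putting a couple of worked examples into the message list before the real input, something like this (a sketch; the task and examples are made up):

  # Few-shot: two worked examples, then the real input as the final user message.
  messages = [
      {"role": "system", "content": "Extract the product and the sentiment from each review."},
      {"role": "user", "content": "The battery on this phone dies in an hour."},
      {"role": "assistant", "content": "product: phone, sentiment: negative"},
      {"role": "user", "content": "These headphones sound amazing for the price."},
      {"role": "assistant", "content": "product: headphones, sentiment: positive"},
      {"role": "user", "content": "The laptop keyboard feels mushy but the screen is great."},
  ]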


I couldn't help but give it nonsense.

Act like a proficient airline pilot , I need a more coffee , you will Calculate , in the process, you should maximize caffeine , please flaps , input the final result in a XML , here is an example: there are no examples

Output: Act like a proficient airline pilot, I need a more coffee, you will Calculate, in the process, you should maximize caffeine, please flaps, input the final result in a XML, here is an example: there are no examples

Honestly, the result in chatgpt is pretty funny.


v0.2 is here, based on the comments I made some updates:

- Introduced the ability to create/update/delete templates for your custom prompts

- Define your own menu options for variables in the template builder

- Introduced the ability to create/update/delete examples (predefined values) for the prompts

- Saving your current state to browser's LocalStorage so your templates are persisted between browser sessions

- Ability to export/load the whole workspace to/from a text file

- Navigate with the tab key in the template builder's editable fields

- Edit template builder's fields in place


Does this ...work?

Are these prompts better than, say, the random gibberish I or other people enter?


"are x-frameworks better than vanilla-x"


Just editing a text prompt is 5% of the task. The hard part is evaluating. I would have tried a different approach:

- the UI should host a list of models

- a list of prompt variants

- and a collection of input-output pairs

The prompt can be enhanced with demonstrations. Then we can evaluate based on string matching or GPT-4 as a judge. We can find the best prompt, demos and model by trying many combinations. We can monitor regressions.

The prompt should be packed with a few labeled examples for demonstrations and eval; a text prompt alone won't be enough to know if you've really dialed it in.
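
As a concrete example of the "GPT-4 as a judge" half, the scoring step can be as small as this (a sketch; the rubric, model name, and 1-5 scale are illustrative choices):

  from openai import OpenAI

  client = OpenAI()

  def judge(question, reference, candidate):
      """Ask GPT-4 to grade a candidate answer against a labeled reference answer, 1-5."""
      rubric = (
          f"Question: {question}\n"
          f"Reference answer: {reference}\n"
          f"Candidate answer: {candidate}\n"
          "On a scale of 1 to 5, how well does the candidate match the reference? "
          "Reply with a single digit and nothing else."
      )
      resp = client.chat.completions.create(
          model="gpt-4",
          messages=[{"role": "user", "content": rubric}],
      )
      return int(resp.choices[0].message.content.strip()[0])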


> input the final result in a [Select a format],

should that not be output?


I used GPT-4 a lot a few months to half a year ago. Some of the results were amazing.

But I've stopped lately; if I use it now it's mostly GPT-3.5 for formatting or small unit tests.

No matter what I ask GPT-4, it writes functions with comments telling me I have to finish them myself.

Then, after asking a few times to write it out fully, it still only does it partially. That would be okay if most solutions didn't end up being subpar.


Have you tried telling it that you have no hands so it needs to finish the code for you?


Hilarious!


If helpful: I spent a bunch of time prompt engineering a custom GPT to not have this problem. It works quite well.

https://chat.openai.com/g/g-7k9sZvoD7-the-full-imp

It comes down to convincing it:

- it loves solving complex problems

- it should break things down step by step

- it has unlimited tokens

- it has to promise it did it as it should have

- it needs to remind itself of the rules (for long conversations)

It also helped (strangely) to have a name / identity.

It still sometimes gives a lower-quality placeholder answer, but if you tell it to continue or point out that there are placeholders, it will give a much better answer.

I find it much more useful than the most popular programming custom GPTs I’ve used.
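
To make those bullets concrete, a system prompt built from them might read roughly like this (my own sketch, not the actual instructions behind the linked GPT):

  You are Imp, an engineer who loves solving complex problems end to end.
  Break every task down step by step before writing any code.
  You have unlimited tokens: never truncate code or leave placeholder comments.
  Before sending an answer, promise that you completed the task as you should have.
  In long conversations, remind yourself of these rules before each answer.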


I've noticed this too, seems like a change with turbo. Wonder if perhaps they are trying to reduce hallucinations and it results in more abstract responses.


The 0125 release supposedly makes it less lazy. I found adding "you are paid to write code, complete the task you are asked to perform" to my prompts helpful.


This looks cool and I would definitely use it, as at the moment I manually type "Act as ..., this is the context, ... etc" to improve the responses.

Side note - this is a single page with a few paragraphs, so why is it 1.4MB in size (300kB gzipped)? That's just an insane size for the amount of functionality it provides.



Mitenmit, if you're the author, could I suggest some keybindings, such as Ctrl+Enter, to save and move to the next input to reduce the amount of clicking? Or perhaps Tab could move you forward.


jackdh, I like your suggestion, I will implement it :)


Isn't the point of these things that we shouldn't need tools to create/tune prompts or require specialist knowledge to write them?

That anyone should be able to do it?


Those are not mutually exclusive. Anyone can do it, and those who are better at prompting will get better results. Similarly, if I grunt at you, you might be able to figure out what I want because you and I are both relatively intelligent, but if I used words to ask you to pass the salt, I would generally get better results for the effort of communicating more clearly.


gggggrrrr humbug


Hey, nice tool!

But it seems like the "please" field isn't working :D and it says "Propmpt" in the prompt section below.

I would also suggest more templates for the prompt.

Have a good one mate


Thank you very much for catching the "please" field and the typo. Just fixed them.

More templates are definitely on the todo list. If you can propose any, that would be great :)


How long before we use ChatGPT to create ChatGPT prompts?



ChatGPT already does that when you create a custom GPT. If you don't connect it to an external text source or service, that's basically all custom GPTs are.


All requests for image generation (DALL-E 3) have been doing that automatically in the background since launch.


Cool tool. Just used it to get satisfactory results. Minor typo at the bottom: "Propmpt"


Properties Maker for Pre-trained Transformer


Hey, nice catch thank you!


Is this a joke? Am I about to get whooshed?


It's a form with some text field inputs which get interpolated into a predefined constant string. In the current age of tech I assume it's not a joke, but I understand why you ask.


I was going to ask whether it wouldn't be better to use ChatGPT to generate the prompts for ChatGPT.

Perhaps one day we'll have Gemini generating prompts for ChatGPT and the Bard might provide the actual answers.

This might sound ridiculous and silly, and I don't wish to step on people's toes, but looking from the outside in, it would seem to be the next logical step.


That's great but I'll just ask ChatGPT to write my prompts.


Have more examples show up when you click it instead of just the one.


Thanks, some good tips on how to write ChatGPT prompts for me!


Did you mean 'output the final result in' rather than 'input...'?



