How to Write Prompts That Use Fewer Tokens

Simple, practical techniques to dramatically reduce your ChatGPT token usage without sacrificing the quality of responses.

Every token counts — whether you're hitting context limits mid-task or managing API costs on a project. The good news is that reducing token usage doesn't mean getting worse results. In most cases, leaner prompts produce better, more focused responses.

Here are the most effective techniques, ranked by impact.

Tip 1: Convert Your Documents to Markdown Before Pasting

This is the single highest-impact change you can make, and most people never do it.

When you paste a raw PDF, Word document, or spreadsheet into ChatGPT, you're feeding it formatting noise alongside the actual content: repeated headers and footers, broken line wraps, and layout debris. A 15-page research paper pasted raw can consume 18,000 tokens. The same paper converted to clean Markdown uses around 6,500 tokens, a 64% reduction with no loss of the content that matters.

inktomd.com converts any file to clean Markdown in seconds. Supports PDF, Word, Excel, PowerPoint, EPUB, and 19 other formats. Free, no signup required.

The workflow takes 30 extra seconds and saves thousands of tokens every time:

1. Upload your file to inktomd.com
2. Copy the Markdown output
3. Paste the Markdown into ChatGPT instead of the raw file

If you do nothing else on this list, do this one.
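To check what a conversion saves on your own files, you can compare a rough token estimate before and after. This is only a minimal sketch: `estimate_tokens` relies on the common rule of thumb of about four characters per token for English text, which is an approximation and not the tokenizer ChatGPT uses, and both function names are made up for illustration.

```python
import math


def estimate_tokens(text: str) -> int:
    """Rough token estimate: about 4 characters per token (English rule of thumb).

    This is an assumption for quick comparisons, not a real tokenizer.
    """
    return math.ceil(len(text) / 4)


def percent_saved(before: int, after: int) -> float:
    """Percentage reduction going from `before` tokens to `after` tokens."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) / before * 100


# The figures quoted above: 18,000 tokens raw vs 6,500 as Markdown.
print(f"{percent_saved(18_000, 6_500):.0f}% fewer tokens")  # prints "64% fewer tokens"
```

Run `estimate_tokens` on the raw paste and on the Markdown version of the same file to get the two numbers for `percent_saved`. For exact counts you would need the model's real tokenizer.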

Tip 2: Use Bullet Points Instead of Paragraphs in Your Prompts

Prose is token-expensive. When you write a prompt as flowing paragraphs, you use filler words, transitions, and repetition that ChatGPT doesn't need.

Compare these two prompts asking for the same thing:

Verbose version (47 tokens): "I was wondering if you could help me take a look at this document and maybe give me some thoughts on what the key points are and whether there are any areas that could be improved in terms of clarity."

Lean version (18 tokens): "Review this document. List: key points, clarity issues."

Same task, about 62% fewer tokens (47 down to 18). Often a better response too, because the instruction is clearer.

Tip 3: Be Specific About What You Want

Vague prompts produce long, exploratory responses that eat through your context window. Specific prompts produce targeted, useful answers.

Instead of: "What do you think about this report?"

Try: "Summarize this report in 3 bullet points. Focus on the financial projections in section 4."

The specific version costs a few more tokens in the prompt, but it generates a much shorter response, and that response is more useful.

Tip 4: Remove Irrelevant Sections Before Pasting

If you have a question about the methodology section of a research paper, you don't need to paste the entire paper. Extract the methodology section, paste only that, and ask your question.

This sounds obvious but almost nobody does it. People paste entire documents out of habit when they only need 10% of the content for their actual question.

Quick habit: before pasting any document, spend 30 seconds deleting the sections you don't need. Introduction, bibliography, footers, appendices — gone. Your context window will thank you.
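If the document is already in Markdown, pulling out a single section can be scripted. The sketch below assumes ATX-style headings (lines starting with `#`) and a hypothetical helper name, `extract_section`. It does not handle `#` lines inside code fences, so treat it as a starting point.

```python
import re


def extract_section(markdown_text: str, heading: str) -> str:
    """Return the section under `heading`, up to the next heading of the
    same or higher level. Returns "" if the heading is not found.

    Assumes ATX-style Markdown headings ("# Title", "## Title", ...).
    """
    lines = markdown_text.splitlines()
    heading_re = re.compile(r"^(#{1,6})\s+(.*?)\s*$")
    start = None
    level = 0
    for i, line in enumerate(lines):
        match = heading_re.match(line)
        if not match:
            continue
        if start is None:
            # Case-insensitive match on the heading text.
            if match.group(2).strip().lower() == heading.strip().lower():
                start, level = i, len(match.group(1))
        elif len(match.group(1)) <= level:
            # Next heading at the same or a higher level ends the section.
            return "\n".join(lines[start:i]).strip()
    return "\n".join(lines[start:]).strip() if start is not None else ""


paper = "# Paper\n\nIntro.\n\n## Methodology\n\nWe sampled 40 users.\n\n## Results\n\nNumbers."
print(extract_section(paper, "Methodology"))
```

Subsections nested under the requested heading are kept, so a question about the methodology still sees its details.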

Tip 5: Avoid Repeating Context

If you've already explained your situation or pasted a document earlier in the conversation, don't re-explain it with every follow-up question.

Bad: "Based on the sales report I showed you earlier, which was the quarterly revenue breakdown for our three product lines across the APAC region, could you tell me which product performed best?"

Good: "From the data above — which product performed best?"

The context is already there. Reference it briefly; don't re-paste it.

Tip 6: Use System-Level Instructions Efficiently

If you're using ChatGPT's system prompt or custom instructions, keep them concise. Your system prompt is sent along with every single turn of the conversation. A 500-token system prompt that could be 100 tokens is burning 400 extra tokens on every turn.

Write system prompts like a style guide, not an essay. Rules and preferences, not explanations.
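That overhead adds up over a conversation. A minimal sketch of the arithmetic, assuming the system prompt is sent once per turn and using a hypothetical 20-turn conversation (the function name is illustrative):

```python
def system_prompt_overhead(extra_tokens_per_turn: int, turns: int) -> int:
    """Total extra input tokens spent on an oversized system prompt."""
    return extra_tokens_per_turn * turns


# 400 avoidable tokens per turn over an assumed 20-turn conversation.
print(system_prompt_overhead(400, 20))  # prints 8000
```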

Tip 7: Split Complex Tasks Into Focused Sessions

One long conversation with many different tasks accumulates token debt quickly — every new message re-sends the entire history.

Instead of one long conversation covering five different topics, have five shorter focused conversations. You get cleaner context, better responses, and much lower token consumption per task.
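The effect of re-sending history can be estimated with a few lines of code. This sketch uses a simplified model, in which each turn adds a fixed number of tokens and every turn re-sends everything so far. The numbers are hypothetical, not measurements.

```python
def total_input_tokens(turn_tokens: list[int]) -> int:
    """Total input tokens when each turn re-sends the whole history.

    turn_tokens[i] is how many tokens turn i adds to the conversation
    (the new prompt plus the previous reply, in this simplified model).
    """
    total = 0
    history = 0
    for size in turn_tokens:
        history += size   # the history grows by this turn's tokens
        total += history  # and the full history is sent as input
    return total


# Hypothetical: 10 turns of 500 tokens each, in one conversation
# versus five separate conversations of 2 turns each.
one_long = total_input_tokens([500] * 10)
five_short = 5 * total_input_tokens([500] * 2)
print(one_long, five_short)  # prints "27500 7500"
```

The cost of one long conversation grows roughly with the square of its length, which is why splitting it helps so much.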

Tip 8: Ask for Shorter Outputs When Length Doesn't Matter

ChatGPT defaults to comprehensive responses. If you just need a quick answer, tell it:

"Answer in one sentence."
"Give me a bullet list only, no explanation."
"Keep your response under 100 words."

The model usually follows these constraints, though word limits are met only approximately, and you save significant tokens on the response side of the equation.

Real Token Savings Example

Here's what these tips look like combined for a real workflow — analyzing a quarterly business report:

| Approach | Prompt Tokens | Response Tokens | Total |
|----------|---------------|-----------------|-------|
| Raw paste, verbose prompt | 16,400 | 1,200 | 17,600 |
| Markdown + lean prompt | 5,800 | 680 | 6,480 |
| Savings | | | 63% |

The Markdown conversion alone accounts for most of that saving. The prompt optimization takes it further.

Start With the Biggest Win

If you implement one thing from this guide, make it Markdown conversion. Convert your documents at inktomd.com before pasting them into ChatGPT — 24 formats supported, completely free, files never stored.

The other tips layer on top of that foundation. Together they can easily reduce your token usage by 60–70% across typical research and document analysis workflows.
