
GitHub Copilot - Notes: Prompt Engineering

What is Prompt Engineering:

  • Process of writing clear instructions for AI (Copilot)

  • Ensures generated code is correct, relevant, and useful


4 Core Principles (4 Ss):

  • Single → focus on one task

  • Specific → give clear, detailed instructions

  • Short → keep prompts concise

  • Surround → provide context (files, comments, open tabs)
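
A minimal sketch of the 4 Ss in practice. The prompt comment, function name, and regex below are illustrative, not taken from Copilot's documentation:

```python
# Prompt (Single, Specific, Short): "Write a function that validates an
# email address with a simple regex and returns True or False."
# Surround: the open file already imports `re`, giving Copilot context.

import re

EMAIL_RE = re.compile(r"^[\w.+-]+@[\w-]+\.[\w.-]+$")

def is_valid_email(address: str) -> bool:
    """Return True if `address` looks like an email address."""
    return bool(EMAIL_RE.match(address))
```

One task, clearly stated, briefly phrased, with surrounding code supplying the rest of the context.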


Best Practices:

  • Be clear and explicit

  • Provide context using comments or related files

  • Use examples to guide output

  • Iterate:

    • Refine prompt instead of rewriting code manually

    • Improve results step by step


How Copilot Learns from Prompts:


Zero-shot:

  • No examples

  • Uses general knowledge

One-shot:

  • One example provided

  • Improves consistency

Few-shot:

  • Multiple examples

  • Handles complex scenarios better
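
A few-shot prompt embeds examples directly in a comment or docstring so Copilot can infer the exact input/output format. The function and examples below are a hypothetical sketch, not official Copilot output:

```python
def slugify(title: str) -> str:
    """Convert a title to a URL slug.

    Few-shot examples that guide the model:
        "Hello World"          -> "hello-world"
        "Prompt  Engineering!" -> "prompt-engineering"
    """
    # Replace punctuation with spaces, then join words with hyphens.
    cleaned = "".join(c if c.isalnum() or c.isspace() else " " for c in title)
    return "-".join(cleaned.lower().split())
```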


Advanced Techniques:


Chain Prompting:

  • Build solutions step by step

  • Manage long conversations by summarizing or resetting
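
Chain prompting can be pictured as a series of small prompts, each producing one piece you verify before moving on. The task and names below are invented for illustration:

```python
# Prompt 1: "Parse a comma-separated line into a list of trimmed fields."
def parse_line(line: str) -> list[str]:
    return [field.strip() for field in line.split(",")]

# Prompt 2 (builds on step 1): "Using parse_line, convert CSV lines into
# dicts keyed by the header row."
def parse_csv(lines: list[str]) -> list[dict[str, str]]:
    header = parse_line(lines[0])
    return [dict(zip(header, parse_line(row))) for row in lines[1:]]
```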

Role Prompting:

  • Ask Copilot to act as an expert

Examples:

  • Security expert → secure code

  • Performance expert → optimized code

  • Testing expert → better test cases
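
For example, a "security expert" role prompt typically steers Copilot toward parameterized queries instead of string formatting. This sqlite3 sketch shows the kind of code that role tends to produce; it is not an actual Copilot response:

```python
import sqlite3

def find_user(conn: sqlite3.Connection, username: str):
    # Placeholder (?) binding prevents SQL injection;
    # never build the query with f-strings or concatenation.
    cur = conn.execute("SELECT id, name FROM users WHERE name = ?", (username,))
    return cur.fetchone()
```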

GitHub Copilot Prompt Process Flow


Overall Flow:

  • Copilot processes prompts in two stages:

    • Inbound (input processing)

    • Outbound (response generation)


Inbound Flow:


  1. Secure Transmission & Context Gathering

    • Prompt sent securely via HTTPS

    • Copilot collects context:

      • Code around cursor

      • File name and type

      • Open tabs and project structure

      • Language and frameworks

    • Uses Fill-in-the-Middle (FIM) → understands before + after code
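
Conceptually, FIM means the model sees both sides of the cursor. The sketch below assembles such a prompt; the marker tokens are placeholders, since Copilot's actual wire format is internal:

```python
def build_fim_prompt(before_cursor: str, after_cursor: str) -> str:
    # The model completes the gap between prefix and suffix.
    return f"<prefix>{before_cursor}<suffix>{after_cursor}<middle>"
```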

  2. Proxy Filter

    • Runs through secure proxy (Azure)

    • Blocks malicious or manipulated prompts

  3. Toxicity Filtering

    • Removes:

      • Harmful or offensive content

      • Personal/sensitive data

  4. Code Generation (LLM)

    • AI model generates code

    • Based on prompt + context


Outbound Flow:


  1. Post-processing & Validation

    • Filters output for:

      • Code quality (bugs, vulnerabilities)

      • Security issues (e.g., XSS, SQL injection)

    • Optional: blocks suggestions similar to public code
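
A naive validator in the spirit of these outbound checks might scan suggestions for known-risky patterns. Copilot's real filtering is far more sophisticated; the patterns here are purely illustrative:

```python
import re

RISKY_PATTERNS = [
    re.compile(r"(?i)innerHTML\s*="),        # common XSS sink
    re.compile(r"(?i)\bDROP\s+TABLE\b"),     # destructive SQL in output
]

def looks_risky(suggestion: str) -> bool:
    """Flag a suggestion if it matches any risky pattern."""
    return any(p.search(suggestion) for p in RISKY_PATTERNS)
```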

  2. Suggestion Delivery & Feedback Loop

    • Safe responses shown to user

    • Copilot learns from:

      • Accepted suggestions

      • Edits and rejections

  3. Continuous Improvement

    • Process repeats for each prompt

    • Improves over time using feedback and context


Exam Tip Summary:

  • Flow = Input → Filtering → Generation → Output → Feedback

  • Key concepts:

    • Context gathering

    • Proxy + toxicity filters

    • LLM code generation

    • Post-validation

    • Feedback loop

  • FIM = uses code before and after cursor for better results

GitHub Copilot Data Handling


Code Suggestions Data Handling:

  • Prompts (code, context) are not stored long-term

  • Data is discarded after suggestion is generated

  • Does not train foundational models from your code

  • Users can opt out of data sharing (for model improvement)


Copilot Chat Data Handling:

  • Chat is interactive and conversational

  • Maintains conversation history for context


Data Retention:

  • Chat data (prompts and responses) may be stored for up to 28 days


Chat Capabilities (Prompt Types):

  • Direct questions → concepts, errors

  • Code requests → generate, fix, explain code

  • Open-ended → best practices, optimization

  • Contextual → based on your code/project


Context Window:

  • Limited amount of code/context Copilot can process

  • Typically:

    • ~200–500 lines of code

    • Copilot Chat: ~4k tokens

  • Limitation impact:

    • Large inputs may lose context


Best Practice:

  • Break problems into smaller parts

  • Provide relevant code snippets only

  • Keep prompts focused for better results
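
The trimming idea above can be sketched as keeping only the lines nearest the cursor until a budget is spent. Real token counting uses a tokenizer; the crude word-count budget here is an assumption for illustration:

```python
def trim_context(code: str, max_tokens: int = 4000) -> str:
    """Keep the most recent lines of `code` that fit the token budget."""
    kept: list[str] = []
    budget = max_tokens
    # Walk backwards: lines nearest the cursor are usually most relevant.
    for line in reversed(code.splitlines()):
        cost = max(1, len(line.split()))  # rough stand-in for token count
        if budget - cost < 0:
            break
        budget -= cost
        kept.append(line)
    return "\n".join(reversed(kept))
```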


Exam Tip Summary:

  • Code suggestions → not stored, not used for training

  • Chat → may retain data (~28 days)

  • Supports multiple prompt types

  • Context window is limited → keep prompts concise and focused

GitHub Copilot LLMs


What are LLMs:

  • Large Language Models = AI models trained on large amounts of data

  • Capable of understanding and generating human language


Key Characteristics:

  • Trained on massive datasets

  • Strong contextual understanding

  • Based on neural networks (ML/AI)

  • Highly versatile across tasks and domains


Role of LLMs in Copilot:

  • Generate context-aware code suggestions

  • Use:

    • Current file

    • Open tabs

    • Project context

  • Improve accuracy and developer productivity


Fine-Tuning:

  • Process of adapting pretrained models for specific tasks

  • Uses smaller, task-specific datasets

  • Enhances performance for targeted use cases


LoRA Fine-Tuning (Low-Rank Adaptation):

  • Efficient alternative to full fine-tuning

  • Adds small trainable components instead of retraining entire model

  • Original model remains unchanged
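
The efficiency gain is easy to see in parameter counts: instead of retraining a d x d weight matrix, LoRA trains two small matrices B (d x r) and A (r x d) and applies W + BA, with r much smaller than d. Toy numbers:

```python
def full_finetune_params(d: int) -> int:
    return d * d            # every weight in the d x d matrix is trainable

def lora_params(d: int, r: int) -> int:
    return 2 * d * r        # only B (d*r) and A (r*d) are trainable
```

With d = 4096 and rank r = 8, LoRA trains roughly 65 thousand parameters per matrix versus almost 17 million for full fine-tuning.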


Benefits:

  • Faster and more resource-efficient

  • High performance with less computation

  • Better than traditional methods (adapters, prefix-tuning)


Exam Tip Summary:

  • LLMs = core engine behind Copilot

  • Provide context-aware code generation

  • Fine-tuning = adapts model to tasks

  • LoRA = efficient fine-tuning method (key concept)

 
 