GitHub Copilot - Notes: Prompt Engineering
- Radek Stolarczyk
What is Prompt Engineering:
Process of writing clear instructions for AI (Copilot)
Ensures generated code is correct, relevant, and useful
4 Core Principles (4 Ss):
Single → focus on one task
Specific → give clear, detailed instructions
Short → keep prompts concise
Surround → provide context (files, comments, open tabs)
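A quick sketch of the 4 Ss in practice, contrasting a vague prompt with one that is single, specific, and short (the example prompt and the resulting function are illustrative, not from the notes):

```python
# Vague prompt (violates the 4 Ss):
#   "make this better"
#
# 4-Ss prompt (Single task, Specific, Short, Surrounded by context):
#   "Write a function that validates an email address with a regex;
#    return True for valid addresses, False otherwise."
#
# A suggestion Copilot might produce for the second prompt:
import re

def is_valid_email(address: str) -> bool:
    """Return True if the address matches a basic email pattern."""
    pattern = r"[\w.+-]+@[\w-]+\.[\w.-]+"
    return re.fullmatch(pattern, address) is not None
```

The focused prompt names one task, the expected inputs, and the expected return values, so the suggestion needs far less correction.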
Best Practices:
Be clear and explicit
Provide context using comments or related files
Use examples to guide output
Iterate:
Refine prompt instead of rewriting code manually
Improve results step by step
How Copilot Learns from Prompts:
Zero-shot:
No examples
Uses general knowledge
One-shot:
One example provided
Improves consistency
Few-shot:
Multiple examples
Handles complex scenarios better
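The difference between the styles can be shown with a few-shot prompt written as comments; the worked input/output examples pin down behaviour that a zero-shot prompt would leave ambiguous (the snake_case task is a hypothetical illustration):

```python
# Few-shot prompting: give Copilot worked examples before asking for code.
# The examples pin down the exact behaviour wanted:
#
#   "HelloWorld"  -> "hello_world"
#   "APIResponse" -> "api_response"
#
# A completion consistent with those examples:
import re

def to_snake_case(name: str) -> str:
    """Convert CamelCase to snake_case, matching the examples above."""
    s = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", name)       # helloWorld -> hello_World
    s = re.sub(r"(?<=[A-Z])(?=[A-Z][a-z])", "_", s)        # APIResponse -> API_Response
    return s.lower()
```

Without the second example, "APIResponse" could plausibly become "a_p_i_response"; the few-shot examples remove that ambiguity.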
Advanced Techniques:
Chain Prompting:
Build solutions step by step
Manage long conversations by summarizing or resetting
Role Prompting:
Ask Copilot to act as an expert
Examples:
Security expert → secure code
Performance expert → optimized code
Testing expert → better test cases
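Role prompts are ordinary chat messages with a persona prefix; the strings below are hypothetical phrasings of the three personas above, not fixed Copilot syntax:

```python
# Role prompting: ask Copilot Chat to answer as a specific kind of expert.
# These prompt strings are illustrative examples only.
role_prompts = {
    "security expert": "Act as a security expert. Review this login handler "
                       "for SQL injection and XSS risks.",
    "performance expert": "Act as a performance expert. Suggest how to speed "
                          "up this nested loop over a large list.",
    "testing expert": "Act as a testing expert. Write edge-case unit tests "
                      "for this date-parsing function.",
}

for role, prompt in role_prompts.items():
    print(f"[{role}] {prompt}")
```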
GitHub Copilot Prompt Process Flow
Overall Flow:
Copilot processes prompts in two stages:
Inbound (input processing)
Outbound (response generation)
Inbound Flow:
Secure Transmission & Context Gathering
Prompt sent securely via HTTPS
Copilot collects context:
Code around cursor
File name and type
Open tabs and project structure
Language and frameworks
Uses Fill-in-the-Middle (FIM) → considers code both before and after the cursor
Proxy Filter
Runs through secure proxy (Azure)
Blocks malicious or manipulated prompts
Toxicity Filtering
Removes:
Harmful or offensive content
Personal/sensitive data
Code Generation (LLM)
AI model generates code
Based on prompt + context
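The Fill-in-the-Middle idea from the inbound flow can be sketched as a prefix/suffix pair around the cursor; the variable names and the sample completion are illustrative only:

```python
# FIM: the model receives code both before (prefix) and after (suffix) the
# cursor, and must propose a middle that joins them into valid code.
prefix = "def area(radius):\n    return "
suffix = "\n\nresult = area(2)"

# A plausible FIM completion that fits both sides:
completion = "3.14159 * radius ** 2"

source = prefix + completion + suffix
namespace = {}
exec(source, namespace)  # runs only because the completion fits the suffix too
print(namespace["result"])
```

A prefix-only model could suggest a middle that clashes with the code below the cursor; seeing the suffix is what lets the suggestion slot in cleanly.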
Outbound Flow:
Post-processing & Validation
Filters output for:
Code quality (bugs, vulnerabilities)
Security issues (e.g., XSS, SQL injection)
Optional: blocks suggestions similar to public code
Suggestion Delivery & Feedback Loop
Safe responses shown to user
Copilot learns from:
Accepted suggestions
Edits and rejections
Continuous Improvement
Process repeats for each prompt
Improves over time using feedback and context
Exam Tip Summary:
Flow = Input → Filtering → Generation → Output → Feedback
Key concepts:
Context gathering
Proxy + toxicity filters
LLM code generation
Post-validation
Feedback loop
FIM = uses code before and after cursor for better results
GitHub Copilot Data Handling
Code Suggestions Data Handling:
Prompts (code, context) are not stored long-term
Data is discarded after suggestion is generated
Does not train foundational models from your code
Users can opt out of data sharing (for model improvement)
Copilot Chat Data Handling:
Chat is interactive and conversational
Maintains conversation history for context
Data Retention:
Chat data may be stored for up to 28 days
Applies to:
Chat in IDE
CLI
Mobile
Chat Capabilities (Prompt Types):
Direct questions → concepts, errors
Code requests → generate, fix, explain code
Open-ended → best practices, optimization
Contextual → based on your code/project
Context Window:
Limited amount of code/context Copilot can process
Typically:
~200–500 lines of code
Copilot Chat: ~4k tokens
Limitation impact:
Large inputs may lose context
Best Practice:
Break problems into smaller parts
Provide relevant code snippets only
Keep prompts focused for better results
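The chunking advice can be sketched as a helper that splits a large file into prompt-sized pieces; the ~4-characters-per-token ratio is a rough rule of thumb, not an exact Copilot limit:

```python
# Context windows are limited, so large files should be split into focused
# chunks before being pasted into a prompt. The 4k-token limit and the
# 4-chars-per-token ratio below are approximations.
def chunk_for_prompt(text: str, max_tokens: int = 4000) -> list[str]:
    """Split text into pieces that should each fit one context window."""
    max_chars = max_tokens * 4  # ~4 characters per token, rule of thumb
    chunks, current = [], ""
    for line in text.splitlines(keepends=True):
        if current and len(current) + len(line) > max_chars:
            chunks.append(current)
            current = ""
        current += line
    if current:
        chunks.append(current)
    return chunks
```

Splitting on line boundaries keeps each chunk syntactically readable, which matters more for a prompt than hitting the size limit exactly.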
Exam Tip Summary:
Code suggestions → not stored, not used for training
Chat → may retain data (~28 days)
Supports multiple prompt types
Context window is limited → keep prompts concise and focused
GitHub Copilot LLMs
What are LLMs:
Large Language Models = AI models trained on large amounts of data
Capable of understanding and generating human language
Key Characteristics:
Trained on massive datasets
Strong contextual understanding
Based on neural networks (ML/AI)
Highly versatile across tasks and domains
Role of LLMs in Copilot:
Generate context-aware code suggestions
Use:
Current file
Open tabs
Project context
Improve accuracy and developer productivity
Fine-Tuning:
Process of adapting pretrained models for specific tasks
Uses smaller, task-specific datasets
Enhances performance for targeted use cases
LoRA Fine-Tuning (Low-Rank Adaptation):
Efficient alternative to full fine-tuning
Adds small trainable components instead of retraining entire model
Original model remains unchanged
Benefits:
Faster and more resource-efficient
High performance with less computation
More parameter-efficient than earlier methods such as adapters and prefix-tuning
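The LoRA idea can be sketched in a few lines of numpy; the dimensions are toy values and the code is a conceptual illustration, not how Copilot's models are actually tuned:

```python
import numpy as np

# LoRA in miniature: instead of updating a full d x d weight matrix W,
# train two small matrices A (r x d) and B (d x r) with rank r << d, and
# use W + B @ A at inference. W itself is never modified.
d, r = 8, 2
rng = np.random.default_rng(0)

W = rng.standard_normal((d, d))         # frozen pretrained weights
A = rng.standard_normal((r, d)) * 0.01  # small trainable adapter
B = np.zeros((d, r))                    # zero-initialised, so W_eff == W at start

W_eff = W + B @ A                       # effective weights after adaptation

# Parameter savings: full fine-tuning touches d*d values, LoRA only 2*d*r.
full_params = d * d      # 64
lora_params = 2 * d * r  # 32
```

Because B starts at zero, the adapted model behaves exactly like the original until training updates A and B, and the original W can be reused across many task-specific adapters.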
Exam Tip Summary:
LLMs = core engine behind Copilot
Provide context-aware code generation
Fine-tuning = adapts model to tasks
LoRA = efficient fine-tuning method (key concept)