bcftools 1.10 fails to index files with wrong INFO/END values but bcftools 1.9 worked fine #1154

@freeseek

The following code reproduces the error:

bcftools view --no-version -G -r X:4380006-4461913 -i 'ID="rs6640136" || ID="rs57675010"' -Oz -o output.vcf.gz \
  http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/supporting/GRCh38_positions/ALL.chrX_GRCh38.genotypes.20170504.vcf.gz
bcftools index -ft output.vcf.gz

The problem here is that the 1000 Genomes Project team made a mess and forgot to liftOver the END coordinates of the VCF. I am not sure whether this counts as a bcftools bug. However, the VCF specification does not really require the END field to always be larger than the POS field, and that requirement is what is causing the problem here. I do find it odd, though, that tabix and previous versions of bcftools worked fine with these odd cases.

If this is considered appropriate bcftools behavior, then I think that, at a minimum, bcftools norm should also check that END is always larger than POS.
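
For anyone who wants to see which records are affected without going through the indexer, here is a minimal sketch that scans plain VCF text for an INFO/END value smaller than POS. The function name `find_bad_end` is hypothetical (it is not part of bcftools), and the sketch assumes standard tab-separated VCF columns with INFO in column 8.

```shell
# Hypothetical helper (not a bcftools command): list VCF records whose
# INFO/END is smaller than POS. Reads uncompressed VCF text on stdin and
# prints CHROM, POS, ID and END, tab-separated.
find_bad_end() {
  awk -F'\t' '
    /^#/ { next }                        # skip header lines
    {
      n = split($8, info, ";")           # INFO is column 8
      for (i = 1; i <= n; i++) {
        if (info[i] ~ /^END=/) {
          end = substr(info[i], 5) + 0   # numeric value after "END="
          if (end < $2 + 0) print $1 "\t" $2 "\t" $3 "\t" end
        }
      }
    }'
}
```

It could be run on the file produced above with `gzip -dc output.vcf.gz | find_bad_end`, since bgzip-compressed files can be read by gzip. Any record it prints has END before POS.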
