How to troubleshoot ignored yaml file?

OriolAbril · March 2, 2020, 10:01pm

Description

I think that the yaml file is being ignored, and I’d like to have some guidance on how to find issues with yaml files in codecov.

Currently what I do is:

1. Use cat codecov.yml | curl --data-binary @- https://codecov.io/validate to validate my yaml file. It says it is valid.
2. Check the Settings section of the repository to confirm that the parsed version matches the yaml file. They are completely different.

I suspect that there is some issue with parsing that is not caught by the validation check. However, I have not been able to access any traceback from parsing.

My question is therefore twofold. First, it would be great if someone could point out what I am missing in order to fix this. Second, it would be even better to get some guidance (or be pointed to any resource) on how to troubleshoot this kind of issue by myself. I am open to helping with this once I know how to do it.

Repository

Here is a PR where CI has run ONLY ONCE and, even though codecov.yml says to wait until 5 builds have finished, the bot commented and edited as new builds were finishing.

github.com/arviz-devs/arviz: Bump minor version to 0.7.0 (arviz-devs:master ← arviz-devs:bump_minor_version), opened 09:38PM - 02 Mar 20 UTC by OriolAbril, +354 -156

PR description: Prepare release 0.7.0. Checklist: [x] Follows official PR format (https://github.com/arviz-devs/arviz/blob/master/CONTRIBUTING.md#pull-request-checklist); [x] Code style correct (follows pylint and black guidelines); [x] Changes are listed in changelog (https://github.com/arviz-devs/arviz/blob/master/CHANGELOG.md#v0xx-unreleased).

Additional Information

Extra checks I have done:

make sure codecov.yml is in repo main directory
make sure yaml file is named codecov.yml and not .codecov.yml

drazisil · March 4, 2020, 2:43pm

Hi @OriolAbril

The validate endpoint is the best bet, but it is not currently up to date (it should be very soon) and as such cannot be fully trusted.

In the case of your b1316f41e3cfd34fe66196a98c1ca11d4a2745e6 commit, the YAML has the following error: after_n_builds: 5 is not a valid key under comment; it only works under notify (and affects all types of notify, so there is no need to get more detailed).
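
As a sketch of that fix, the key could be moved as shown below. The thread only says the key belongs "under notify"; nesting notify inside a top-level codecov section is an assumption based on the codecov.yml reference and should be checked against the current docs. The value 5 comes from the original post.

```yaml
# codecov.yml (sketch; the top-level `codecov:` parent is an assumption,
# the thread itself only says "under notify")
codecov:
  notify:
    after_n_builds: 5   # hold all notifications until 5 uploads have arrived

# `after_n_builds` is no longer listed under `comment:`; at the time of
# this reply the parser rejected it there.
```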

If things are still wrong after this fix, please share a commit SHA and I’ll check the logs again.

OriolAbril · March 4, 2020, 6:30pm

This seems to have fixed the issue with yaml file parsing, thanks!

I am not sure what “validate endpoint” means, though. Could you explain it or provide a link?

We now have another, more general issue. I think it is related to Codecov status stuck at "waiting for status to be reported" on github (not completely sure though). I am still at a loss as to how to troubleshoot it.

If you take the PR “extend make_ufunc and improve wrap_xarray_ufunc defaults” (Pull Request #1107, arviz-devs/arviz), Azure has uploaded coverage info from 5 builds (the 3 base tests and the 2 external tests) but only 3 seem to have been received. Is this also a configuration issue?

Codecov page for latest commit in PR: https://codecov.io/gh/arviz-devs/arviz/commit/4c8e16e91c405091efe0b306f8c752387175657d/build

drazisil · March 4, 2020, 7:09pm

I’m referring to “About the Codecov YAML”.

Let me check that commit and see what’s up.

Commit 4c8e16e91c405091efe0b306f8c752387175657d still has the after_n_builds under comment. Did you provide the correct SHA?

OriolAbril · March 4, 2020, 9:18pm

I may have forgotten to rebase, but it is still puzzling; I understand less of what is happening each time.

1. I did the validation and got Valid! with after_n_builds in the comment section. I have also seen that it should be supported (not sure if already or in the near future: Release Notes for Codecov v4.4.9). Do you know if comment will default to the value in notify, or must both be set for comment and check to wait?
2. Commit 4c8e16e91c405091efe0b306f8c752387175657d may still have after_n_builds in codecov (due to the forgotten rebase), but both comment and check are waiting! 3 builds have finished (according to codecov; 5 have actually finished), but there is no message whatsoever.
3. How can builds that finished hours ago still not appear in codecov? The codecov page for commit 4c8e16e... still says "Notifications are pending CI completion. Waiting for GitHub’s status webhook to queue notifications."

Thanks for your patience.

drazisil · March 5, 2020, 2:12pm

Thank you for yours!

First, the validator is not 100% correct in its valid message, due to a tightening of the schema. It is a priority to correct and should be fixed soon.

The link you referenced is for our Enterprise solution which, while mostly the same, does have some differences due to the different needs our Enterprise customers face. For codecov.io, https://docs.codecov.io/docs/codecovyml-reference is the best reference.

Let’s look at 4c8e16e91c405091efe0b306f8c752387175657d

There are a couple of issues here. First, you appear to still have after_n_builds in the behavior section, which is making the YAML fail the parser. Second, the fallback is, as you say, 3 out of 5 builds.

When I check the database for that commit I only see 3 uploads, which is why. Can you link me where you see that Codecov is saying we processed 5 so I can see what happened?

EDIT: I retract what I said about the after_n_builds. It’s not valid, but looks like it should be. Discussing this with engineering.

OriolAbril · March 5, 2020, 10:35pm

Ok, thanks for the clarification!

Sorry about the confusion, I try not to but still mix the docs for the two of them from time to time. Thanks for commenting on the docs suggestion too.

I’ll try to explain myself better. codecov website does not always receive the 5 builds, and I have no idea why. In the same PR, with exactly the same tests run on both commits:

4c8e16e91c405091efe0b306f8c752387175657d has 3 builds on codecov even though 5 builds finished on Azure
0a773ad91c4a3c15a8b7177cfc0a1fcc0830da53 instead received the 5 builds in codecov. Here is the link to the corresponding Azure build; it is difficult to tell the two Azure builds apart.

Any idea as to why it works only sometimes?

Note: base tests and external tests upload to codecov, benchmarks do not, and the other two are not run for PRs.

drazisil · March 9, 2020, 5:14pm

Can you try the -n flag to give the uploads names and see if that helps us determine which ones are not completing the upload step? See codecov-bash/codecov at master (codecov/codecov-bash on GitHub).

OriolAbril · March 9, 2020, 9:03pm

I have opened a PR only for troubleshooting this so it won’t get merged because of the other changes.

I found an error message in one of the builds where not all reports were uploaded! It is actually in the base commit of the PR linked above, which is why we get a nearly 2% coverage increase by adding a name to the uploaded codecov builds. Would using the --required flag solve the issue by failing the CI build when the codecov upload fails? Or is there a better way to tackle this problem?

I am copying the error message, for the full traceback and context see link:

Error: HTTPSConnectionPool(host='codecov.io', port=443): Max retries exceeded with url: /codecov/v4/raw/2020-03-06/2C016F20EC330FC563151DA316E11499/00c6d5c057944966e765399d1508181b89c1ce3d/8abc87f8-533f-42ca-b73d-fa182d6ef3f0.txt?AWSAccessKeyId=AKIAIHLZSCQCS4WIHD4A&Expires=1583500724&Signature=%2FGe0uHJMOuRL06J0nbZCI9uPdLs%3D (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7fea47cab6d0>: Failed to establish a new connection: [Errno 110] Connection timed out'))

drazisil · March 10, 2020, 2:16pm

That looks like the Python uploader, which I was not aware you were using. I’m not sure offhand what the --required flag does; I’m more familiar with the bash uploader.

But yes, if you need the job to fail if the Codecov upload fails you will need some sort of flag as it normally does not stop the build on failure.

OriolAbril · March 10, 2020, 9:49pm

Is there any way to reduce how often this error happens? I am not sure if this is an issue with Azure sending the reports or with codecov receiving them. If possible, we’d like to avoid rerunning the whole job to reupload coverage to codecov.

We are using the Python uploader; would using the bash uploader make any difference?

Note: Just to be extra clear with what I said above, we don’t really know at all the reason why the uploads are failing nor have any idea about how to investigate the reason, so any help on this direction is greatly appreciated too.

drazisil · March 13, 2020, 3:23pm

If you are using the bash uploader, the -v flag will output a lot of information, including the message of any network upload errors. That would probably help.
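
For illustration, an Azure Pipelines step that runs the bash uploader with -v, together with the -n flag suggested earlier in the thread, might look like the fragment below. It is a sketch, not the ArviZ project's actual configuration: the upload name base-tests and the displayName are hypothetical, and the download-and-run invocation is the form the bash uploader documented at the time, so treat it as an assumption to verify.

```yaml
# azure-pipelines.yml fragment (sketch under the assumptions stated above)
steps:
  - bash: |
      # -v: verbose output, including the message of any network upload error
      # -n: a name for this upload ("base-tests" is a hypothetical example)
      bash <(curl -s https://codecov.io/bash) -v -n "base-tests"
    displayName: Upload coverage to Codecov
```

Giving each job its own name makes it easier to see in the Codecov build list which uploads never arrived.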

OriolAbril · March 13, 2020, 7:19pm

We set the -v flag, but we can’t see anything that helps us tackle the upload error. We tried 5 uploads, waiting 90 seconds between uploads, and all 5 failed.

Commit SHA: 89d56251eb3cf360433ca4e19405c469b88bf365; you can check that 4 out of 5 builds uploaded correctly.

drazisil · March 16, 2020, 1:09pm

It looks like you are still using the Python uploader. Can you try the bash uploader with -v, please?

OriolAbril · March 16, 2020, 3:18pm

We changed to the bash uploader, and there is no comparison between them! The bash uploader automatically retries the upload when there is a connection error, and the verbose flag does print useful info. Thanks for the pointer!

Is there somewhere in the docs discouraging the use of the Python uploader? Maybe there should be?

Here is the commit using the bash uploader: 419e9242fecbd94b8a8d644238f6eda5f18523cf, and here is one build that could not upload on the first try but eventually succeeded: https://dev.azure.com/ArviZ/ArviZ/_build/results?buildId=2135&view=logs&j=e6a7683b-6131-58a8-ef68-5f3a9120796c&t=0a472ee5-4a3b-5581-ec9e-6294371ddc1c&l=14

So it looks like using the bash uploader should solve the upload failure issue in most cases.

I have also tried to upload the coverage from a fork PR (it has no access to secrets and therefore no token) because it looks like public Azure Pipelines projects should not need a token:

If you have a public project on TravisCI, CircleCI, AppVeyor, Azure Pipelines, or GitHub Actions an upload token is not required.

Our project is https://dev.azure.com/ArviZ/ArviZ which is public (I double checked in project settings that visibility is public), however, we got the following message:

Commit sha does not match Azure build. Please upload with the Codecov repository upload token to resolve issue.

Here is the commit 5ec1c2c3802382d4be2847053ebb1e57ef0c76cb (note that nothing was uploaded to codecov for this commit, I am not sure it will be of any help) and the link to the build (where the output of codecov bash uploader can be read): https://dev.azure.com/ArviZ/ArviZ/_build/results?buildId=2137&view=logs&j=e4994452-efbb-51bd-c9fc-f1d030f5bbfb&t=b874d77c-8497-5843-d5d5-e253609482a4&l=14

drazisil · March 16, 2020, 3:49pm

I’m fairly sure you are hitting the issue in “CodeCov Uploads from Azure Pipelines are failing with 'Build numbers do not match'” (#10 by X-Guardian), where the Azure API returns a different SHA on merge commits.

I would see if you can confirm, then watch that thread.

As for docs discouraging the Python uploader: not yet. We are trying to consolidate all the uploaders into a single one, but it’s a slow process and we are trying to make it as minimally disruptive as possible.

OriolAbril · March 16, 2020, 6:08pm

It looks like it is the same issue, thanks. I tried searching for this error message but somehow missed the thread.

I’ll update the steps to troubleshoot codecov issues to take this into account:

1. Use cat codecov.yml | curl --data-binary @- https://codecov.io/validate to validate my yaml file. It says it is valid.
2. Make sure to use the bash uploader, and set the -v flag to debug. Check that the parsed yaml (if using the -v flag) tracks the yaml file.
3. Check the Settings section of the repository to confirm that the parsed version matches the yaml file.

Thank you so much for your help!

drazisil · March 17, 2020, 12:21pm

Hey @OriolAbril, following up on this. Both exist now, and here is how they work:

notify.after_n_builds stops the whole notification flow, which includes comments. 
comment.after_n_builds stops only comments.
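
A sketch of how the two settings might be combined, assuming they compose as described above. The values 3 and 5 are illustrative only, and the top-level codecov parent for notify is an assumption based on the codecov.yml reference rather than something stated in this thread.

```yaml
# codecov.yml (sketch; values are illustrative)
codecov:
  notify:
    after_n_builds: 3   # nothing is sent (status or comment) before 3 uploads

comment:
  after_n_builds: 5     # the PR comment alone waits further, for 5 uploads
```

With only the comment entry present, statuses would be reported as uploads arrive while the comment still waits for the fifth upload.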

Thank you for your patience and persistence in pushing us to locate and resolve this bug. I’ll close out your docs thread as well, but if you want to make an edit to make it clearer, please do so.

OriolAbril · March 17, 2020, 12:40pm

Great! Thanks for all the help!
