I suspect that there is some issue with parsing that is not caught by the validation check. However, I have not been able to access any traceback from parsing.
My question is twofold. First, it would be great if someone could help and point out what I am missing in order to fix this. Second, it would be even better to get some guidance (or be pointed to any resource) on how to troubleshoot this kind of issue by myself. I am open to helping with this once I know how to do it.
Repository
Here is a PR where CI has run ONLY ONCE, and even though codecov.yml says to wait until 5 builds have finished, the bot commented and then edited its comment as new builds were finishing.
Additional Information
Extra checks I have done:
make sure codecov.yml is in repo main directory
make sure yaml file is named codecov.yml and not .codecov.yml
The validate endpoint is the best bet, but it’s not currently up to date (it should be very soon) and as such cannot be fully trusted.
In the case of your b1316f41e3cfd34fe66196a98c1ca11d4a2745e6 commit, the YAML has the following error: after_n_builds: 5 is not a valid key under comment, it only works under notify (and affects all types of notify, so no need to get more detailed)
If things are still wrong after this fix, please share a commit SHA and I’ll check the logs again.
I may have forgotten to rebase, but it is still puzzling; every time, I understand less of what is happening.
I did the validation, and got Valid! with the after_n_builds in the comment section. I have also seen that it should be supported (not sure if already or in the near future: Release Notes for Codecov v4.4.9). Do you know if comment will default to the value in notify, or if both must be set for comment and check to wait?
commit 4c8e16e91c405091efe0b306f8c752387175657d may still have after_n_builds in codecov (due to forgotten rebase) but both comment and check are waiting! 3 builds have finished (according to codecov, 5 have actually finished), but no message whatsoever
How can builds that finished hours ago still not appear in codecov? The codecov page for commit 4c8e16e... still says "Notifications are pending CI completion. Waiting for GitHub’s status webhook to queue notifications."
First, the validator is not 100% correct in its valid message, due to a tightening of the schema. It is a priority to correct and should be fixed soon.
The link you referenced is for our Enterprise solution which, while mostly the same, does have some differences due to the different needs our Enterprise customers face. For codecov.io, https://docs.codecov.io/docs/codecovyml-reference is the best reference.
Let’s look at 4c8e16e91c405091efe0b306f8c752387175657d
There are a couple of issues here. First, you appear to still have after_n_builds in the behavior section, which is making the YAML fail the parser. Second, the fallback is, as you say, 3 out of 5 builds.
When I check the database for that commit I only see 3 uploads, which is why. Can you link me where you see that Codecov is saying we processed 5 so I can see what happened?
EDIT: I retract what I said about the after_n_builds. It’s not valid, but looks like it should be. Discussing this with engineering.
Sorry about the confusion, I try not to, but I still mix up the docs for the two of them from time to time. Thanks for commenting on the docs suggestion too.
I’ll try to explain myself better. codecov website does not always receive the 5 builds, and I have no idea why. In the same PR, with exactly the same tests run on both commits:
0a773ad91c4a3c15a8b7177cfc0a1fcc0830da53 instead received the 5 builds in codecov, here is the link to corresponding Azure build, it is difficult to distinguish the two Azure builds between them
Any idea as to why it works only sometimes?
Note: base tests and external tests upload to codecov, benchmarks do not, and the other two are not run for PRs.
I have opened a PR only for troubleshooting this so it won’t get merged because of the other changes.
I found an error message in one of the builds where not all reports were uploaded! It actually is in the base commit of the PR linked above, which is why we get a nearly 2% coverage increase by adding a name to the uploaded codecov builds. Would using the --required flag solve the issue by failing the CI build when the codecov upload fails? Or is there a better way to tackle this problem?
I am copying the error message, for the full traceback and context see link:
Error: HTTPSConnectionPool(host='codecov.io', port=443): Max retries exceeded with url: /codecov/v4/raw/2020-03-06/2C016F20EC330FC563151DA316E11499/00c6d5c057944966e765399d1508181b89c1ce3d/8abc87f8-533f-42ca-b73d-fa182d6ef3f0.txt?AWSAccessKeyId=AKIAIHLZSCQCS4WIHD4A&Expires=1583500724&Signature=%2FGe0uHJMOuRL06J0nbZCI9uPdLs%3D (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7fea47cab6d0>: Failed to establish a new connection: [Errno 110] Connection timed out'))
That looks like the Python uploader, which I was not aware you were using. I’m not sure offhand what the --required flag does; I’m more familiar with the bash uploader.
But yes, if you need the job to fail if the Codecov upload fails you will need some sort of flag as it normally does not stop the build on failure.
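A rough sketch of one way to make the CI step fail when no suitable uploader flag is at hand: wrap the upload command and treat any output line starting with `Error:` as a failure. That prefix is taken from the error message quoted earlier in this thread; whether every failure is reported that way is an assumption, and `upload_failed` and `run_and_check` are made-up names, not part of any Codecov tool.

```python
import subprocess
import sys


def upload_failed(output):
    """Return True if any line of the captured output starts with 'Error:'."""
    return any(line.strip().startswith("Error:") for line in output.splitlines())


def run_and_check(cmd):
    """Run `cmd` (a list of arguments) and return an exit code for CI.

    Returns 1 if the command exits non-zero or prints an 'Error:' line,
    and 0 otherwise. The command's output is echoed so it stays in the log.
    """
    proc = subprocess.run(cmd, capture_output=True, text=True)
    output = proc.stdout + proc.stderr
    sys.stdout.write(output)
    if proc.returncode != 0 or upload_failed(output):
        return 1
    return 0
```

In a CI step this could be called as `sys.exit(run_and_check([...]))`, with the list holding whatever upload command the job already runs.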
Is there any way to lower the number of times this error happens? I am not sure if this is an issue with Azure sending the reports or with codecov receiving them. And if possible we’d like to avoid rerunning the whole job to reupload coverage to codecov.
We are using the Python uploader; would using the bash uploader make any difference?
Note: Just to be extra clear about what I said above, we don’t know why the uploads are failing, nor do we have any idea how to investigate the cause, so any help in this direction is greatly appreciated too.
If you are using the bash uploader, the -v flag will output a lot of information, including the message of any network upload errors. That would probably help.
Changed to using the bash uploader, there is no comparison between them! The bash uploader automatically retries the upload when there is a connection error, and the verbose flag does print useful info, thanks for the pointer!
Is there somewhere in the docs discouraging the use of the Python uploader? Maybe there should be?
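For anyone who has to stay on a Python-based upload step, the retry behaviour described here can be approximated with a small wrapper. This is only a sketch of the general retry-with-backoff idea, not how the bash uploader is implemented. `retry` and its parameters are hypothetical names, and which exception types to catch depends on the HTTP client in use (the built-in `ConnectionError` and `TimeoutError` are assumed here).

```python
import time


def retry(action, attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call `action()` until it succeeds or `attempts` calls have failed.

    Only connection-style errors are retried; the delay doubles after
    each failed attempt. The last failure is re-raised so the caller
    (and the CI job) can see it.
    """
    for attempt in range(attempts):
        try:
            return action()
        except (ConnectionError, TimeoutError):
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
```

Passing `sleep` as a parameter keeps the function easy to exercise without actually waiting between attempts.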
If you have a public project on TravisCI, CircleCI, AppVeyor, Azure Pipelines, or GitHub Actions an upload token is not required.
Our project is https://dev.azure.com/ArviZ/ArviZ which is public (I double checked in project settings that visibility is public), however, we got the following message:
Commit sha does not match Azure build. Please upload with the Codecov repository upload token to resolve issue.
I would see if you can confirm, then watch that thread.
Not yet. We are trying to consolidate all the uploaders into a single one, but it’s a slow process and we are trying to make it as minimally disruptive as possible.
Hey @OriolAbril, following up on this. Both exist now, and here is how they work:
notify.after_n_builds stops the whole notification flow, which includes comments.
comment.after_n_builds stops only comments.
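To make the difference between the two settings concrete, here is a toy model of the behaviour as described in the two lines above. It is not Codecov's implementation: `ready_to_send` is a made-up function, the config is a plain dict that mirrors the key names used in this thread, and treating a missing key as "do not wait" is an assumption.

```python
def ready_to_send(uploads_received, config):
    """Say which notifications may go out for a given number of uploads.

    notify.after_n_builds gates every notification (checks and comments);
    comment.after_n_builds additionally gates only the PR comment.
    """
    notify_n = config.get("notify", {}).get("after_n_builds", 0)
    comment_n = config.get("comment", {}).get("after_n_builds", 0)

    notify_ok = uploads_received >= notify_n
    return {
        "checks": notify_ok,
        "comment": notify_ok and uploads_received >= comment_n,
    }


# Example: checks wait for 3 uploads, the comment waits for all 5.
config = {"notify": {"after_n_builds": 3}, "comment": {"after_n_builds": 5}}
print(ready_to_send(3, config))
```

With 3 of 5 uploads received, as in the commit discussed earlier, this model would release the checks but still hold back the comment.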
Thank you for your patience and persistence in pushing us to locate and resolve this bug. I’ll close out your docs thread as well, but if you want to make an edit to make it clearer, please do so.