Development guidelines

The CKI Project is a complicated set of subprojects that interact between each other and changes in one code base impact other projects in sometimes unexpected ways.

The project started as a proof of concept and prototypes were deployed to production. Now, a few years later, the project is well established and we should think more about the code we write. This includes both ongoing development and elimination of technical debt we introduced before.

I want to submit a change

Awesome! Here's a handy checklist to go through:

Use forks

Even if you have permissions to push to the upstream repo, fork it and do the development in your fork. The upstream repositories really should contain only branches that are relevant for everyone (such as releases and branches to base your development upon) and not private code.

Test your changes

Did you add tests for the code you added?

Did the CI pass? If you modified the data verify the data is correct instead of just relying on the result.

Did you run any repo-specific testing (e.g. verification script for reporter changes)? Say what you did / what commands you used so the reviewer can reproduce your testing and verify the results are indeed correct.

Use linters

You are working in a team of people. Your code will be read and modified and should be easy to understand and debug. Strive to write code that is friendly to others and document it well.

Define variables instead of having a chained call/access that takes up the whole line. The long call is harder to follow and doesn't include any error handling.

Humans are visual types! Line breaks and visual indentation at the right places make the code much faster to understand. Here's a small example:

# Bad: The function call is unintuitively split and it's not obvious where the
# calls end
new_jobs = sorted(
    [x for x in jobs if x['stage'] in [last_stage_run, get_first_stage(x,
        pipeline)]])

# Good: It's clear which parts of the logic belong together
new_jobs = sorted(
    [x for x in jobs
     if x['stage'] in [last_stage_run, get_first_stage(x, pipeline)]]
)

# Good: It's clear which parts of the logic belong together
new_jobs = sorted(
    [x for x in jobs if x['stage'] in [last_stage_run,
                                       get_first_stage(x, pipeline)]]
)

Follow the linters and good practices for the language you use! They point out patterns that cause bugs or make the code harder to deal with. Try to actually understand what the linter is saying instead of blindly making your code pass, making it even less understandable in the process. Sometimes it's reasonable to ignore linters' warnings - e.g. ignoring the "line too long" messages for long URLs. If you have to disable a warning only do it for the specific line(s) instead of the whole project. Some good linters to check out are pycodestyle, pyflakes, pylint or shellcheck.

Document, document, document

How to run your changes? Did the interface change? Make sure the user guide of the project is updated together with your code changes!

Is the interface you're using quirky? Did you add a workaround for a specific bug? If it's something that's not immediately obvious unless you're familiar with the situation, chances are other people running into that code will be equally confused. This is when docstrings with explanation in the code are important.

Explain why is the change needed in the commit message.

If possible, find the root cause

It's easy to work around a symptom instead of fixing the core problem. As our projects interact a lot, the root cause may not be in the place which failed. E.g. kernel publishing in the pipeline failed with a missing variable in RC file and reporter exploded because of that. However, the root cause is the GitLab runner bug which caused build to be killed and marked as successful. In this case, patching the publishing or reporter would just hide the problem and make it harder to debug in the future.

If you're proposing a bug fix, make sure you fetched down the root cause and fixed that and aren't just adding a workaround for a symptom. If you (or anyone around you) can't find the root cause right away and the project needs to be fixed ASAP, add information about the problem to the code base with the workaround and open a ticket for it so it's not forgotten about.

I want to review a change

Here's a tiny checklist for you as well!

  • Verify the changes are actually tested
  • Passing CI is a good sign but if no unit tests were added the code is not tested.
  • Verify the data instead of just relying on the green check mark. E.g. if the change modifies RC data, check the data is actually changed correctly in the CI pipelines.
  • Do NOT merge changes without tests unless it's an absolute emergency!
  • If you are the maintainer or were chosen to be a reviewer, you probably have a good knowledge of the project in question.
  • Is there a simpler or more robust way to reach the end goal?
  • Is a related change to a different repository needed because the interface was modified?
  • Make sure the new changes are easy to understand and follow.

Help, the change was merged without proper review or testing

REVERT IT! Seriously. We had a MR with a single line of code merged without testing because it's so trivial and all pipelines were suddenly crashing because a variable it used wasn't defined yet. And that MR wasn't even fixing the problem it claimed in the first place.

If there is a bug in the merged code, rushing to fix it may just cause us to overlook another problem and introduce more issues instead of fixing it. Even if the code doesn't have any obvious bugs people would be afraid to contribute to it as it would be hard to find out if anything broke if there are no tests.

Sometimes bugs happen and that's fine. It doesn't mean that we shouldn't strive for code that's easy to maintain, understand, debug and is well tested.

Logging

For CLI tools, use print() to stdout to communicate with the user. Output to stderr is reserved for logging via Python's logging framework.

To get a logger, use cki-lib's cki_lib.logger.get_logger(__name__). For modules that can be called as an executable or via python -m, you have to use a literal name, e.g. cki.webhook.public for the public webhook main module. For the logging levels, use

  • WARNING for exceptional stuff about which a CLI user needs to be notified; nothing should be logged with this level during normal operation.
  • INFO for progress information that shouldn't be shown to a CLI user by default, but that is useful in a container to see what is going on, e.g. info about requests that are handled/rejected.
  • DEBUG for information that is only useful for a developer to figure out what is going on in the case of a bug. This will not be shown in a container by default.