Our highlights from KubeCon + CloudNativeCon Europe 2020
Ugur Akkar and Abdulkadir Yavuz
KubeCon Europe is a yearly conference about anything and everything that has to do with Kubernetes. Every year, dozens of speakers present talks on interesting topics. From storage to container vulnerability scanning, KubeCon has a talk for everyone, whether you’re still a bit green or a veteran. This year, KubeCon Europe 2020 was held as a completely digital event. Abdulkadir and Ugur, two of our junior Cloud Engineers, followed the event from start to finish. Read on to find out their highlights from KubeCon Europe 2020!
Our talk highlights of KubeCon Europe 2020
K3s is the lightweight version of Kubernetes, with a binary only 50MB in size. That’s why k3s is perfect for running on edge infrastructure, with its minimum requirements only being 1 CPU and just 1GB of RAM. In theory then, k3s could run perfectly on a Raspberry Pi 4! Additionally, k3s is really easy to set up and run, since you only need to run one binary on the master. Run the k3s agent on your nodes and ta-dah, you’ve got a node.
It’s also very easy to bake in apps along with k3s. Users can place manifest files and container images in a directory on the host and k3s will automatically spin up those workloads when booting up.
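To make the "bake in" idea a bit more concrete, here is a minimal Python sketch that drops a manifest into the directory a k3s server watches. The default path below is the one we know from the k3s documentation, but treat it as an assumption and check your own installation. The `hello-edge` Deployment and the `bake_in_manifest` helper are hypothetical names used only for illustration.

```python
import tempfile
from pathlib import Path

# Assumed default auto-deploy directory on a k3s server node.
K3S_MANIFEST_DIR = Path("/var/lib/rancher/k3s/server/manifests")

# Hypothetical workload, used purely for illustration.
HELLO_MANIFEST = """\
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hello-edge
spec:
  replicas: 1
  selector:
    matchLabels:
      app: hello-edge
  template:
    metadata:
      labels:
        app: hello-edge
    spec:
      containers:
        - name: web
          image: nginx:stable
"""


def bake_in_manifest(name: str, manifest: str,
                     manifest_dir: Path = K3S_MANIFEST_DIR) -> Path:
    """Write a manifest into the watched directory and return its path."""
    manifest_dir.mkdir(parents=True, exist_ok=True)
    target = manifest_dir / f"{name}.yaml"
    target.write_text(manifest)
    return target


if __name__ == "__main__":
    # A temporary directory keeps the sketch runnable without a k3s host.
    with tempfile.TemporaryDirectory() as tmp:
        print(bake_in_manifest("hello-edge", HELLO_MANIFEST, Path(tmp)))
```

On a real node you would call `bake_in_manifest` with the default directory (and the required file permissions) before k3s starts.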
What we found most interesting, though, is that applications don’t see the difference between k3s and k8s. That means it’s perfectly possible to build a development k3s cluster to test your apps on, while saving costs and resources. If it runs on k3s, it’ll run on k8s.
At the time of writing this, k3s has yet to implement a built-in etcd like a fully-fledged Kubernetes cluster, but it’s something that’s coming soon. At this point, it’s not 100% production ready yet, but we’re excited to see which direction Rancher will take k3s next.
You can watch the full talk on k3s here.
Understanding Kubernetes networking
‘Understanding Kubernetes networking’ was a talk about how networking works within Kubernetes with Flannel and Calico, and the difference between these two. Jeff Poole, the presenter for this particular talk, demonstrated how he analyzed network packets between pods and nodes. To make sure this demo could be done with minimal resources, he deployed some Docker containers that ‘think’ they are VMs using footloose. Routing between the pods occurs via the iptables on the nodes themselves.
Something interesting that’s usually hidden is the way nodes route requests to pods. Apparently, via iptables and some basic math, there’s a 33.33% chance that the first of three pods is chosen. If it’s not the first one, there’s a 50% chance it’ll be the second pod, and if it isn’t the second one, it’s guaranteed to be the third.
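The "basic math" is easy to check. The small Python sketch below assumes a chain of rules where rule i (counting from zero) matches with probability 1/(n - i), which is how we understood the demo. The function names are ours. It shows that every pod ends up with the same overall chance.

```python
from fractions import Fraction


def rule_probabilities(n: int) -> list:
    """Per-rule match probability: 1/n, 1/(n-1), ..., 1/1."""
    return [Fraction(1, n - i) for i in range(n)]


def overall_probabilities(n: int) -> list:
    """Chance that each pod is picked, given earlier rules did not match."""
    result = []
    remaining = Fraction(1)  # probability that no earlier rule matched
    for p in rule_probabilities(n):
        result.append(remaining * p)
        remaining *= 1 - p
    return result


if __name__ == "__main__":
    print(rule_probabilities(3))     # per-rule: 1/3, 1/2, 1
    print(overall_probabilities(3))  # overall: 1/3 for each pod
```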
Jeff then compared two Container Network Interface (CNI) plugins: Flannel and Calico. Flannel grants every node a static IP subnet and uses VXLAN as encapsulation for the data packets. VXLAN encapsulates Ethernet (layer 2) frames in UDP packets and sends these back out. This is what the entire process looks like chronologically:
ethernet packet → IP packet → UDP packet → VXLAN packet → Ethernet packet → IP packet → original data packet
Calico, on the other hand, dynamically grants IP subnets to nodes according to the nodes’ needs. Contrary to Flannel, Calico operates at layer 3 (IP). That means Calico uses IP encapsulation (IP-in-IP). Because Calico operates at layer 3, it doesn’t have any layer 2 knowledge. Therefore, you can only encapsulate IP packets with Calico.
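One practical consequence of the two encapsulation styles is the per-packet overhead. The Python sketch below is our own back-of-the-envelope calculation, not something from the talk. It assumes IPv4 headers without options and no VLAN tags.

```python
# Header sizes in bytes (IPv4 without options, no VLAN tag assumed).
OUTER_IPV4 = 20
UDP = 8
VXLAN = 8
INNER_ETHERNET = 14

OVERHEAD = {
    # VXLAN carries a full inner Ethernet frame inside UDP.
    "vxlan": OUTER_IPV4 + UDP + VXLAN + INNER_ETHERNET,
    # IP-in-IP only adds one outer IP header.
    "ipip": OUTER_IPV4,
}


def pod_mtu(link_mtu: int, encapsulation: str) -> int:
    """Largest inner IP packet that still fits in one outer packet."""
    return link_mtu - OVERHEAD[encapsulation]


if __name__ == "__main__":
    for kind in ("vxlan", "ipip"):
        print(kind, OVERHEAD[kind], pod_mtu(1500, kind))
```

On a standard 1500-byte link this leaves 1450 bytes for pod traffic with VXLAN and 1480 bytes with IP-in-IP.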
Personally, we thought this talk was very interesting because it taught us a lot about networking within Kubernetes, and how different CNIs like Flannel and Calico handle networking. Despite us being at a junior level and the talk being rated at expert level, we had no trouble following along. Props to Jeff for explaining everything so clearly! You can watch Jeff’s entire talk here and follow a link to lab docs here.
How This Innocent Image Had a Party in My Cluster
These days, you have to scan on three levels to build and host secure apps: vulnerabilities, malware, and misconfigurations.
- Vulnerabilities: scanning for CVEs with a vulnerability scanner such as Trivy, which scans images and looks at which dependencies have CVEs according to vulnerability databases.
- Malware: scanning files and comparing the SHA hash against malware databases.
- Misconfigurations: hard-coded passwords, SSH keys, weak ciphers and so on.
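The malware check in the list above can be sketched in a few lines of Python. This is a simplified illustration under our own assumptions: the set of known-bad hashes stands in for a real malware database, and `find_known_malware` is a hypothetical helper, not the API of any scanner.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_known_malware(root, known_bad_hashes: set) -> list:
    """Return all files under root whose SHA-256 is in the known-bad set."""
    return sorted(
        path
        for path in Path(root).rglob("*")
        if path.is_file() and sha256_of(path) in known_bad_hashes
    )


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        sample = Path(tmp) / "sample.bin"
        sample.write_bytes(b"pretend this is malicious")
        # Stand-in for a lookup against a real malware database.
        known_bad = {sha256_of(sample)}
        print(find_known_malware(tmp, known_bad))
```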
But the question is, are you secure even after scanning for these three possible security threats? According to our speakers, Amir Jerbi and Itay Shakury, the answer is ‘no’, because of the existence of evasive malware. Evasive malware hides itself in upstream packages, a technique known as a ‘supply chain attack’ or ‘poisoning the well’.
This type of malware usually only activates itself when an image is already up and running in a cluster. That’s why it’s important to also scan images during runtime. Additionally, what could help avoid this problem is a ‘shift left responsibility’: the idea that security should be an integral part of development from early on in the process.
The kernel is the place to be to reliably spot malware. This is usually done with eBPF, a technology built into the Linux kernel. Tracee is an open-source tool built around eBPF to validate containers and make sure there isn’t any malicious software being executed.
This talk taught us a lot about how to spot malware in a container environment. It proved to be a good introduction to security with containers, and the speakers very eloquently explained a complex subject. The accompanying demo was educational and proved that just scanning an image before runtime is not enough to have secure infrastructure. Watch the full talk here.
Hey, Did You Hear About This New CVE?
In this talk, speakers Andrew and Alexandr explain how we should prepare for complex vulnerabilities, how we should combat them and how we can solve any problems if anything goes wrong.
Andrew and Alexandr say to first start with a checklist to secure four important infrastructure components that hackers exploit the most.
Use as few privileged RBAC profiles as possible, and limit the rights these profiles have within the cluster as much as possible. When setting up an RBAC profile, an audit log can be turned on to determine which API permissions are not used and can be deleted. Additionally, implement alerts for access denied requests so you can trace when they occur.
It’s handy to construct a visual map of the network connectivity between pods, resources and services, so that you only allow the requests that are actually needed between these resources. It’s also strongly recommended to switch off the hostNetwork parameter. When this parameter is enabled, a pod shares the network of its node, enabling hackers to find out the NodeIP.
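As an illustration of what a narrowly scoped RBAC profile can look like, here is a minimal Python sketch that builds a read-only Role as a dictionary and prints it as JSON. The role name, the namespace and the `has_wildcards` check are our own hypothetical choices, not something shown in the talk.

```python
import json


def read_only_pod_role(namespace: str, name: str = "pod-reader") -> dict:
    """Build a Role that may only read pods in a single namespace."""
    return {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "Role",
        "metadata": {"namespace": namespace, "name": name},
        "rules": [
            {
                "apiGroups": [""],  # "" is the core API group
                "resources": ["pods"],
                "verbs": ["get", "list", "watch"],
            }
        ],
    }


def has_wildcards(role: dict) -> bool:
    """Flag rules that grant '*' on API groups, resources or verbs."""
    return any(
        "*" in rule.get(field, [])
        for rule in role.get("rules", [])
        for field in ("apiGroups", "resources", "verbs")
    )


if __name__ == "__main__":
    role = read_only_pod_role("demo")
    print(json.dumps(role, indent=2))
    print("wildcards:", has_wildcards(role))
```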
Click here for a deeper dive
The user that runs a container should be a non-root user. If a hacker manages to enter the container, they won’t be able to switch to a root user, which keeps the impact on the container and infrastructure to a minimum.
More info here
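Both the hostNetwork and the non-root recommendation can be checked automatically. The Python sketch below is a simplified, hypothetical audit of a pod spec given as a dictionary. It only looks at `hostNetwork`, `runAsNonRoot` and `runAsUser`, and it assumes container-level settings override pod-level ones.

```python
def audit_pod_spec(spec: dict) -> list:
    """Return human-readable findings for a pod spec dictionary."""
    findings = []
    if spec.get("hostNetwork", False):
        findings.append("hostNetwork is enabled")

    pod_ctx = spec.get("securityContext", {})
    for container in spec.get("containers", []):
        # Container-level settings take precedence over pod-level ones.
        ctx = {**pod_ctx, **container.get("securityContext", {})}
        run_as_user = ctx.get("runAsUser")
        runs_as_root = run_as_user == 0
        unspecified = run_as_user is None and not ctx.get("runAsNonRoot", False)
        if runs_as_root or unspecified:
            findings.append(f"container {container['name']} may run as root")
    return findings


if __name__ == "__main__":
    risky = {"hostNetwork": True, "containers": [{"name": "app"}]}
    safe = {
        "securityContext": {"runAsNonRoot": True, "runAsUser": 1000},
        "containers": [{"name": "app"}],
    }
    print(audit_pod_spec(risky))
    print(audit_pod_spec(safe))
```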
Use distroless images as much as possible. Distroless images contain only an application and its runtime dependencies. These images contain no package managers, shells or other programs you might expect in a normal Linux distribution. With distroless images, you create a safer environment for an application and make it much harder for hackers to execute any actions.
But what if my image has vulnerabilities? Whenever a vulnerability or breach is detected, you’ll need to take action ASAP. Andrew and Alexandr noted these five actions:
- Reproduce: what is an attacker able to do, is the vulnerability valid, can I execute the attack myself, what are the symptoms?
- Communication: make a document and/or chatroom with the right people to document and discuss the issue.
- Ownership: who will fix the problem and relay the problem to the client?
- Analyze and fix: analyze the problem. Can you quickly implement a fix for the problem? If not, is there a short-term fix that can temporarily solve the problem?
- After effects: look at logs, impacted users, root cause, when the breach happened, …
Security plays an important role in the IT world, especially today when different vulnerabilities often surface. Thanks to Andrew and Alexandr’s talk, we’ve learned which actions to undertake to minimize the impact on users and clients and quickly fix the problem in the event of a breach. Props to Andrew and Alexandr for their clear explanations and examples! You can watch the talk here.
Escaping the Jungle – Migration to Cloud Native CI/CD
In this talk, Anton Weiss explains which challenges organizations face when switching to modern software delivery, such as rebuilding CI/CD processes and tooling based on Cloud Native concepts.
While a company is still a start-up, it has only a few employees, communication is fairly simple and smooth, there are only a few build jobs and so on. However, when those start-ups grow to be IT giants, several things change: expansion of infrastructure, multiple programming languages and frameworks, studying and maintaining databases, hundreds of CI jobs and so on. Therefore, Anton suggests migrating to a Cloud Native CI/CD.
Cloud Native CI/CD is easily integrated into k8s, meaning scaling different agents and environments is easy and doesn’t cause downtime. This means you can use release techniques such as blue-green or canary deployment after the application is built and tested. Anton suggests using the AMF Pattern (Agent Model Flow Pattern). At the start of the Jenkins pipeline, an Agent is defined which will execute the build. Next, the Model is defined, which specifies what will be done: testing, building, … The Model also states in which programming language this’ll be done.
Lastly, the Flow is defined, which calls on the previously made Models and executes the accompanying script.
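To show how the three parts relate, here is a small Python sketch of our understanding of the pattern. It is not Jenkins pipeline syntax and not Anton’s implementation. The class names simply mirror the description above, and the agent label and shell commands are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Agent:
    """Where the build runs, for example a labelled build pod."""
    label: str


@dataclass
class Model:
    """What will be done, and in which programming language."""
    name: str
    language: str
    script: List[str]


@dataclass
class Flow:
    """Calls the previously defined Models in order on the Agent."""
    agent: Agent
    models: List[Model]

    def plan(self) -> List[str]:
        # Return the planned steps instead of executing them.
        return [
            f"[{self.agent.label}] {model.name} ({model.language}): {command}"
            for model in self.models
            for command in model.script
        ]


if __name__ == "__main__":
    flow = Flow(
        agent=Agent("k8s-build-pod"),
        models=[
            Model("build", "go", ["go build ./..."]),
            Model("test", "go", ["go test ./..."]),
        ],
    )
    for line in flow.plan():
        print(line)
```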
After successfully testing and building the application, it’s time for deployment. Anton recommends Helm Charts as an excellent tool to deploy the application and services to a k8s cluster. He also mentioned trying to use GitOps, which is still relatively new and can’t yet immediately be implemented in our CI/CD.
Personally, I found this an interesting topic, because migrating CI/CD to Cloud Native is not easy, and neither is the decision to do so. Anton presented clear arguments for why we should migrate to Cloud Native CI/CD. Watch his entire talk here!
Uncharted Territories: Discovering Vulnerabilities in Public Helm Charts
Helm Charts are a big hit and are used more and more. But are they safe to use? In her talk, Hayley Denbraver explains Helm, its Charts and the security aspect of Helm Charts that use public images. Hayley shared a plug-in to search for possible vulnerabilities in the images of a public Helm Chart. Her example included an illustration showing that the most popular images have many vulnerabilities, with database images containing the most.
The most common vulnerabilities with their security grades are:
You can find more info on the top three vulnerabilities here:
The intent here is not to immediately stop using Helm Charts with images that contain vulnerabilities, but to examine the potential solutions. Use tools and plug-ins to scan for vulnerabilities, delete unnecessary code and add-ons and so on, so they don’t cause any security issues later. One of the most important points to take home is testing: test your code and test every possible scenario to reduce the risk.
Personally, we were a bit shocked by this talk, because we had no idea that the most popular public images had so many vulnerabilities. On the other hand, it was a very educational talk with lots of information about different vulnerabilities in public images. It shows that you shouldn’t just trust a public image, even if it’s used by millions of developers! Watch the entire talk here.
Despite KubeCon Europe 2020 being KubeCon’s first fully digital conference and the short time in which the event was put together, everything was well-prepared and organized. At the start there were a couple of issues, such as the maximum stream quality of 840p sometimes making it difficult to read code or what was on a speaker’s slides. However, every KubeCon Europe 2020 talk is available for rewatching if you missed something, which shows just how highly regarded knowledge sharing is in the KubeCon community!
As junior cloud engineers new to Kubernetes, we’ve learned so much that’ll help us advance our careers. We were also pleased to see that the Kubernetes community is one big family, with people eager to help each other out with answering questions, reviewing projects and so on.
KubeCon Europe 2020 has made us realize that Kubernetes and Cloud Native are much broader than how we’re currently utilizing them. We can’t wait to improve further and see what the future holds!