How to access the Kubernetes Dashboard from outside the cluster

Edit the kubernetes-dashboard service.

$ kubectl -n kube-system edit service kubernetes-dashboard

You should see a YAML representation of the service. Change type: ClusterIP to type: NodePort and save the file. If it has already been changed, go to the next step.

# Please edit the object below. Lines beginning with a '#' will be ignored,
# and an empty file will abort the edit. If an error occurs while saving this file will be
# reopened with the relevant failures.
#
apiVersion: v1
...
  name: kubernetes-dashboard
  namespace: kube-system
  resourceVersion: "343478"
  selfLink: /api/v1/namespaces/kube-system/services/kubernetes-dashboard-head
  uid: 8e48f478-993d-11e7-87e0-901b0e532516
spec:
  clusterIP: 10.100.124.90
  externalTrafficPolicy: Cluster
  ports:
  - port: 443
    protocol: TCP
    targetPort: 8443
  selector:
    k8s-app: kubernetes-dashboard
  sessionAffinity: None
  type: ClusterIP
status:
  loadBalancer: {}
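
If you prefer to make the change without opening the interactive editor, here is a minimal sketch using the official Kubernetes Python client (the kubernetes package); it assumes your kubeconfig already gives you access to the cluster:

from kubernetes import client, config

# Load credentials from ~/.kube/config (assumes kubectl already works on this machine).
config.load_kube_config()
v1 = client.CoreV1Api()

# Patch only the service type; the rest of the spec is left untouched.
v1.patch_namespaced_service(
    name="kubernetes-dashboard",
    namespace="kube-system",
    body={"spec": {"type": "NodePort"}},
)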

Next we need to check the port on which the Dashboard was exposed.

$ kubectl -n kube-system get service kubernetes-dashboard
NAME                   CLUSTER-IP       EXTERNAL-IP   PORT(S)        AGE
kubernetes-dashboard   10.100.124.90   <nodes>       443:31707/TCP   21h

Dashboard has been exposed on port 31707 (HTTPS). Now you can access it from your browser at:
https://master-ip:31707

The master-ip can be found by executing kubectl cluster-info. Usually it is either 127.0.0.1 or the IP of your machine, assuming that your cluster is running directly on the machine on which these commands are executed.

In case you are trying to expose the Dashboard using NodePort on a multi-node cluster, you have to find out the IP of the node on which the Dashboard is running in order to access it. Instead of accessing https://master-ip:nodePort, you should access https://node-ip:nodePort.
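
If you want to discover both values programmatically, here is a minimal sketch using the official Kubernetes Python client; it assumes the standard dashboard labels and namespace shown above, so adjust them if your installation differs:

from kubernetes import client, config

config.load_kube_config()
v1 = client.CoreV1Api()

# The NodePort assigned to the dashboard service.
svc = v1.read_namespaced_service("kubernetes-dashboard", "kube-system")
node_port = svc.spec.ports[0].node_port

# The node the dashboard pod is actually scheduled on.
pods = v1.list_namespaced_pod("kube-system", label_selector="k8s-app=kubernetes-dashboard")
node_name = pods.items[0].spec.node_name

# That node's address (InternalIP here; use ExternalIP if your nodes have one).
node = v1.read_node(node_name)
node_ip = next(a.address for a in node.status.addresses if a.type == "InternalIP")

print(f"https://{node_ip}:{node_port}")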

If the Dashboard is still not accessible, execute the following:

sudo iptables -P FORWARD ACCEPT

Prometheus pod consuming a lot of memory

In Prometheus 1.x I could use '--storage.local.memory-chunks' to limit memory usage. What is the equivalent in Prometheus 2.0?

Prometheus 2.0 uses the OS page cache for data. It will only use as much memory as it needs to operate. The good news is that the memory use is far more efficient than in 1.x. The amount needed to collect more data is minimal.

There is some extra memory needed if you run a large number of queries, or queries that require a large amount of data.
You will want to monitor the memory use of the Prometheus process (process_resident_memory_bytes) and how much page cache the node has left (node_exporter, node_memory_Cached).
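
As an illustration, the snippet below reads those two metrics through the Prometheus HTTP API with the requests library; it assumes Prometheus is reachable at http://localhost:9090, that the server scrapes itself under job="prometheus", and that node_exporter is being scraped (node_memory_Cached is the name used by older node_exporter releases):

import requests

PROMETHEUS = "http://localhost:9090"  # assumed address; adjust for your setup

def instant_query(expr):
    # /api/v1/query evaluates an instant-vector expression at the current time.
    resp = requests.get(f"{PROMETHEUS}/api/v1/query", params={"query": expr})
    resp.raise_for_status()
    return resp.json()["data"]["result"]

# Resident memory of the Prometheus process itself.
print(instant_query('process_resident_memory_bytes{job="prometheus"}'))

# Page cache remaining on the nodes.
print(instant_query("node_memory_Cached"))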

Prometheus giving an error for alert rules ConfigMap in Kubernetes

When creating a ConfigMap for Prometheus alert rules, it gives an error as follows:
"rule manager" msg="loading groups failed" err="yaml: unmarshal errors:\n line 3: field rules not found in type rulefmt.RuleGroups"

The correct format for the rules to be added is as follows:

apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-alert-rules
data:
  alert.rules: |-
    groups:
    - name: example
      rules:
      - alert: Lots_Of_Billing_Jobs_In_Queue
        expr: sum (container_memory_working_set_bytes{id="/",kubernetes_io_hostname=~"(.*)"}) / sum (machine_memory_bytes{kubernetes_io_hostname=~"(.*)"}) * 100 > 40
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: container memory high
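
Before loading the ConfigMap, it can help to sanity-check the nesting of the rule file, since that is exactly what the rulefmt.RuleGroups error complains about. The sketch below is only a structural check with PyYAML (promtool performs the full validation); the sample rule inside it is a simplified placeholder:

import yaml  # PyYAML

rules_text = """\
groups:
- name: example
  rules:
  - alert: Lots_Of_Billing_Jobs_In_Queue
    expr: vector(1)
    for: 5m
"""

doc = yaml.safe_load(rules_text)

# The top level must be a mapping with a 'groups' list, and each group needs
# a 'name' and its own 'rules' list; mis-indenting 'rules' one level too high
# is what produces the "field rules not found" error.
assert isinstance(doc, dict) and isinstance(doc.get("groups"), list)
for group in doc["groups"]:
    assert "name" in group and isinstance(group.get("rules"), list)
print("rule file structure looks valid")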

Can’t chown /usr/local in High Sierra

/usr/local can no longer be chown'd in High Sierra. Instead, use:

sudo chown -R $(whoami) $(brew --prefix)/*

When executing jobs in a Jenkins pipeline and a job fails at night, how do you resume it the same night?

When executing Selenium jobs in a Jenkins pipeline and a job fails at night, how do you resume it the same night?

Solution:

You can use the Naginator plugin to achieve the intended behavior. Configure it as follows:

Install the plugin, then check the post-build action "Retry build after failure" on your project's configuration page.

If the build fails, it will be rescheduled to run again after the time you specified. You can choose how many times to retry running the job. For each consecutive unsuccessful build, you can choose to extend the waiting period.

AWS – Does Elastic Load Balancing actually prevent LOAD BALANCER failover?

I’ve taken this straight from some AWS documentation:

“As traffic to your application changes over time, Elastic Load Balancing scales your load balancer and updates the DNS entry. Note that the DNS entry also specifies the time-to-live (TTL) as 60 seconds, which ensures that the IP addresses can be remapped quickly in response to changing traffic.”

Two questions:

1) I was originally under the impression that a single static IP address would be mapped to multiple instances of an AWS load balancer, thereby providing fault tolerance at the balancer level: if, for instance, one machine crashed for whatever reason, the static IP address registered to my domain name would simply be dynamically 'moved' to another balancer instance and continue serving requests. Is this wrong? Based on the quote above from AWS, it seems that the only magic happening here is that AWS's DNS servers hold multiple A records for your AWS-registered domain name, and after 60 seconds the TTL expires and Amazon's DNS entry is updated to only send requests to active IPs. This still means up to 60 seconds of failed connections on the client side. True or false? And why?

2) If the above is true, would it be functionally equivalent if I were using a hosting provider such as GoDaddy, entered multiple A records, and set the TTL to 60 seconds?

Thanks!

Solution:

The ELB is assigned a DNS name, which you can then assign to an A record as an alias. If you have your ELB set up with multiple instances, you define the health check: you can determine what path is checked, how often, and how many failures indicate an instance is down (for example, check / every 10s with a 5s timeout, and if it fails 2 times consider it unhealthy). When an instance becomes unhealthy, all the remaining instances still serve requests just fine without delay. If the instance returns to a healthy state (for example, it passes 2 checks in a row), then it returns as a healthy host in the load balancer.
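
For a classic ELB, the health-check behaviour described above maps directly onto boto3's configure_health_check call. This is a minimal sketch; the load balancer name and region are placeholders:

import boto3

elb = boto3.client("elb", region_name="us-east-1")  # classic ELB API; region is a placeholder

# Check "/" over HTTP every 10s with a 5s timeout; 2 consecutive failures mark an
# instance unhealthy, and 2 consecutive successes return it to service.
elb.configure_health_check(
    LoadBalancerName="my-load-balancer",  # placeholder name
    HealthCheck={
        "Target": "HTTP:80/",
        "Interval": 10,
        "Timeout": 5,
        "UnhealthyThreshold": 2,
        "HealthyThreshold": 2,
    },
)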

What the quote is referring to is the load balancer itself. In the event that it has an issue or an AZ becomes unavailable, the quote is describing what happens with the underlying ELB DNS record, not the alias record you assign to it.

Whether or not traffic is affected depends partially on how sessions are handled by your setup: whether they are sticky or handled by another system like ElastiCache or your database.
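
To see the DNS-level behaviour the quoted documentation describes, you can resolve the ELB's DNS name yourself. This is a minimal sketch with a placeholder hostname; the set of addresses it returns changes as Elastic Load Balancing scales or replaces load balancer nodes, which is why the 60-second TTL matters:

import socket

# Placeholder ELB DNS name; substitute the one assigned to your load balancer.
elb_dns_name = "my-elb-1234567890.us-east-1.elb.amazonaws.com"

# An ELB DNS name typically resolves to several A records, one per load balancer node.
addresses = {info[4][0] for info in socket.getaddrinfo(elb_dns_name, 443, proto=socket.IPPROTO_TCP)}
print(addresses)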

AWS S3 Can't do anything with one file

I'm having issues trying to remove a file from my S3 bucket with the following name: \bPatrick bla bla 1 PV@05-06-2018-19:42:01.jpg (the leading \b is a backspace character).

If I try to rename it through the S3 console, it just says that the operation failed. If I try to delete it, the operation will "succeed" but the file will still be there.

I've tried removing it through the AWS CLI. When listing the object, I get this back:

 {
        "LastModified": "2018-06-05T18:42:05.000Z",
        "ETag": "\"b67gcb5f8166cab8145157aa565602ab\"",
        "StorageClass": "STANDARD",
        "Key": "test/\bPatrick bla bla 1 PV@05-06-2018-19:42:01.jpg",
        "Owner": {
            "DisplayName": "dev",
            "ID": "bd65671179435c59d01dcdeag231786bbf6088cb1ca4881adf3f5e17ea7e0d68"
        },
        "Size": 1247277
    },

But if I try to delete or head it, the CLI won't find it.

aws s3api head-object --bucket mybucket --key "test/\bPatrick bla bla 1 PV@05-06-2018-20:09:37.jpg"

An error occurred (404) when calling the HeadObject operation: Not Found

Is there any way to remove, rename or just move this image from the folder?

Regards

Solution:

It looks like your object’s key begins with a backspace (\b) character. I’m sure there is a way to manage this using the awscli but I haven’t worked out what it is yet.

Here’s a Python script that works for me:

import boto3

s3 = boto3.client('s3')
bucket = 'avondhupress'
# '\b' in a Python string literal is the backspace character that begins the key.
key = 'test/\bPatrick bla bla 1 PV@05-06-2018-19:42:01.jpg'

s3.delete_object(Bucket=bucket, Key=key)

Or the equivalent in node.js:

const aws = require('aws-sdk');
const s3 = new aws.S3({ region: 'us-east-1', signatureVersion: 'v4' });

const params = {
  Bucket: 'avondhupress',
  // '\b' in the string literal is the backspace character that begins the key.
  Key: 'test/\bPatrick bla bla 1 PV@05-06-2018-19:42:01.jpg',
};

s3.deleteObject(params, (err, data) => {
  if (err) console.error(err, err.stack);
});