
Re: OpenShift and "export IPCFG="ip=<ip>::<gateway>:<netmask>:<hostname>:<iface>:none nameserver=srv1 [nameserver=srv2 [nameserver=srv3 [...]]]""



Hey Jorge,

Hmm, not yet when using the UPI method.  I'm quite certain the worker config is set correctly in guestinfo.ignition.config.data, yet I receive the errors below:

==> bootstrap-kube-scheduler-rhbs01.osc01.nix.mds.xyz_kube-system_kube-scheduler-e231dfd09bf195f87a1a786c27a367df864879c5aa597d5865cfa9c839a379df.log <==
2021-04-20T04:16:30.944697032+00:00 stderr F E0420 04:16:30.944574       1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.Pod: failed to list *v1.Pod: Get "https://localhost:6443/api/v1/pods?fieldSelector=status.phase%21%3DSucceeded%2Cstatus.phase%21%3DFailed&limit=500&resourceVersion=0": dial tcp [::1]:6443: connect: connection refused

==> bootstrap-machine-config-operator-rhbs01.osc01.nix.mds.xyz_default_machine-config-server-fa8d62d97454781fa9f6b12daf42a484a058873d7a630a17cfd73eca7220fa81.log <==
2021-04-20T04:16:32.281495982+00:00 stderr F I0420 04:16:32.281248       1 api.go:117] Pool worker requested by address:"192.168.0.196:38722" User-Agent:"Ignition/2.9.0" Accept-Header: "application/vnd.coreos.ignition+json;version=3.2.0, */*;q=0.1"
2021-04-20T04:16:32.281495982+00:00 stderr F E0420 04:16:32.281382       1 api.go:136] couldn't get config for req: {worker 0xc0003a0b80}, error: refusing to serve bootstrap configuration to pool "worker"


Using a simple wget to simulate what Ignition is doing above, I get a 500 Internal Server Error, and the same message appears in the bootstrap log file:


# wget https://api-int.osc01.nix.mds.xyz:22623/config/worker --no-check-certificate
--2021-04-19 00:53:51--  https://api-int.osc01.nix.mds.xyz:22623/config/worker
Resolving api-int.osc01.nix.mds.xyz (api-int.osc01.nix.mds.xyz)... 192.168.0.70
Connecting to api-int.osc01.nix.mds.xyz (api-int.osc01.nix.mds.xyz)|192.168.0.70|:22623... connected.
WARNING: cannot verify api-int.osc01.nix.mds.xyz's certificate, issued by ‘/OU=openshift/CN=root-ca’:
  Unable to locally verify the issuer's authority.
HTTP request sent, awaiting response... 500 Internal Server Error
2021-04-19 00:53:51 ERROR 500: Internal Server Error.


# wget https://api-int.osc01.nix.mds.xyz:22623/config/bootstrap --no-check-certificate
--2021-04-19 00:54:12--  https://api-int.osc01.nix.mds.xyz:22623/config/bootstrap
Resolving api-int.osc01.nix.mds.xyz (api-int.osc01.nix.mds.xyz)... 192.168.0.70
Connecting to api-int.osc01.nix.mds.xyz (api-int.osc01.nix.mds.xyz)|192.168.0.70|:22623... connected.
WARNING: cannot verify api-int.osc01.nix.mds.xyz's certificate, issued by ‘/OU=openshift/CN=root-ca’:
  Unable to locally verify the issuer's authority.
HTTP request sent, awaiting response... 500 Internal Server Error
2021-04-19 00:54:12 ERROR 500: Internal Server Error.
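
For anyone reproducing this, here is a minimal Python sketch of the same request, but carrying the User-Agent and Accept header that Ignition sent according to the machine-config-server log line above. The function names are hypothetical, and whether these headers change the server's answer is not something this thread establishes; the sketch only makes the test request resemble the real one. Certificate verification is disabled, matching --no-check-certificate.

```python
import ssl
import urllib.request

# Header values copied from the machine-config-server log line quoted above.
IGNITION_USER_AGENT = "Ignition/2.9.0"
IGNITION_ACCEPT = "application/vnd.coreos.ignition+json;version=3.2.0, */*;q=0.1"


def build_config_request(host: str, pool: str, port: int = 22623) -> urllib.request.Request:
    """Build (but do not send) a GET request for /config/<pool>."""
    url = f"https://{host}:{port}/config/{pool}"
    headers = {"User-Agent": IGNITION_USER_AGENT, "Accept": IGNITION_ACCEPT}
    return urllib.request.Request(url, headers=headers)


def fetch_config(req: urllib.request.Request, timeout: float = 10.0) -> bytes:
    """Send the request without verifying the server certificate."""
    ctx = ssl.create_default_context()
    ctx.check_hostname = False          # must be disabled before verify_mode
    ctx.verify_mode = ssl.CERT_NONE     # equivalent of --no-check-certificate
    with urllib.request.urlopen(req, timeout=timeout, context=ctx) as resp:
        return resp.read()


if __name__ == "__main__":
    req = build_config_request("api-int.osc01.nix.mds.xyz", "worker")
    print(req.full_url)  # call fetch_config(req) to actually send it
```

A non-2xx answer such as the 500 above surfaces as urllib.error.HTTPError from fetch_config, so the status code is still visible to the caller.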


Taking a look at what is running on the ports, I see the following:


[root@rhbs01 log]# ps -ef|grep -Ei 2804
root        2804    2767  0 04:38 ?        00:00:00 /usr/bin/machine-config-server bootstrap
root        9004    8032  0 04:48 pts/0    00:00:00 grep --color=auto -Ei 2804
[root@rhbs01 log]# netstat -pnltu|grep -Ei 2804
tcp6       0      0 :::22623                :::*                    LISTEN      2804/machine-config
tcp6       0      0 :::22624                :::*                    LISTEN      2804/machine-config
[root@rhbs01 log]#
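
The same check can be made from another machine, which also exercises any load balancer in front of the node. This is only a sketch: port_open is a hypothetical helper, and a successful TCP connect says nothing about what the server will answer.

```python
import socket


def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


if __name__ == "__main__":
    # Ports taken from the netstat output above.
    for port in (22623, 22624):
        print(port, port_open("api-int.osc01.nix.mds.xyz", port))
```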


It appears fine.  However, a Google search turned up the following source file:

https://github.com/openshift/machine-config-operator/blob/master/pkg/server/bootstrap_server.go

Reading it, requests for any pool other than master are refused (line 61):

// 3. Load the machine config.
// 4. Append the machine annotations file.
// 5. Append the KubeConfig file.
func (bsc *bootstrapServer) GetConfig(cr poolRequest) (*runtime.RawExtension, error) {
    if cr.machineConfigPool != "master" {
        return nil, fmt.Errorf("refusing to serve bootstrap configuration to pool %q", cr.machineConfigPool)
    }
    // 1. Read the Machine Config Pool object.
    fileName := path.Join(bsc.serverBaseDir, "machine-pools", cr.machineConfigPool+".yaml")
    glog.Infof("reading file %q", fileName)
    data, err := ioutil.ReadFile(fileName)
    if os.IsNotExist(err) {
        glog.Errorf("could not find file: %s", fileName)
        return nil, nil
    }


So it appears I'm at an impasse.  Given the earlier config, how can I install the worker using UPI if the code above refuses to serve it?  I went over the config a few times, and unless I'm misinterpreting something, I don't see how to get past this for a UPI install.
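
To make the behaviour of the quoted guard concrete, here is a small Python model of just that excerpt. It is not the operator's code: PoolRefused, bootstrap_get_config and the pools_on_disk dict are hypothetical stand-ins. It shows that, read literally, the bootstrap server refuses both the /config/worker and the /config/bootstrap requests made with wget above, since neither names the master pool.

```python
from typing import Dict, Optional


class PoolRefused(Exception):
    """Raised when the requested pool is not served by the bootstrap server."""


def bootstrap_get_config(pool: str, pools_on_disk: Dict[str, str]) -> Optional[str]:
    """Model of the quoted guard: only the "master" pool gets past the check."""
    if pool != "master":
        raise PoolRefused(f'refusing to serve bootstrap configuration to pool "{pool}"')
    # Mirrors the "could not find file" branch, which returns nil, nil.
    return pools_on_disk.get(pool)


if __name__ == "__main__":
    pools = {"master": "<contents of machine-pools/master.yaml>"}
    for name in ("master", "worker", "bootstrap"):
        try:
            print(name, "->", bootstrap_get_config(name, pools))
        except PoolRefused as exc:
            print(name, "->", exc)
```

If that reading is right, the error is what this server is written to return for a worker request while it runs in bootstrap mode. Where the worker config is meant to come from instead is not answered by the excerpt.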


Thanks,


On 3/21/2021 3:08 AM, Jorge Rúa wrote:
Glad it worked, enjoy your shiny new cluster :)

On Sun, Mar 21, 2021, 04:30 TomK <tomkcpr mdevsys com> wrote:
Hi Jorge,

Suggestions worked.  Thanks once more.

For reference, here's what I did, in case it helps others as well.

1) Add a Serial Port to the VM under Virtual Hardware and enter the name of the output file in which to save the logs.

2) Download the log files from the datastore.  Review and fix any errors.  Example below:


[   11.156393] systemd[1]: Startup finished in 6.800s (kernel) + 0 (initrd) + 4.352s (userspace) = 11.153s.
------
Ignition has failed. Please ensure your config is valid. Note that only
Ignition spec v3.0.0+ configs are accepted.

A CLI validation tool to check this called ignition-validate can be
downloaded from GitHub:
    https://github.com/coreos/ignition/releases
------

Displaying logs from failed units: ignition-fetch-offline.service
-- Logs begin at Sun 2021-03-21 03:07:51 UTC, end at Sun 2021-03-21 03:07:54 UTC. --
Mar 21 03:07:54 ignition[749]: no config URL provided
Mar 21 03:07:54 ignition[749]: reading system config file "/usr/lib/ignition/user.ign"
Mar 21 03:07:54 ignition[749]: no config at "/usr/lib/ignition/user.ign"
Mar 21 03:07:54 ignition[749]: config successfully fetched
Mar 21 03:07:54 ignition[749]: parsing config with SHA512: b71f59139d6c3101031fd0cee073e0503f233c47129db8597462687a608ae0a4b594bf9c170ce55dbd289d4be2638f68e4d39c9b2f50c81f956d5bca24955959
Mar 21 03:07:54 systemd[1]: ignition-fetch-offline.service: Triggering >
Mar 21 03:07:54 ignition[749]: error at line 7 col 5: invalid character ']' after object key:value pair
Mar 21 03:07:54 ignition[749]: failed to fetch config: config is not valid
Mar 21 03:07:54 ignition[749]: failed to acquire config: config is not valid
Mar 21 03:07:54 ignition[749]: Ignition failed: config is not valid
Press Enter for emergency shell or wait 5 minutes for reboot.                
Press Enter for emergency shell or wait 4 minutes 45 seconds for reboot.     
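
A syntax error like the one in this log can be caught before booting the VM by parsing the file locally. The sketch below uses only Python's json module; find_json_error is a hypothetical helper and the broken sample is made up for illustration. It checks JSON syntax only. Validity against the Ignition spec is a separate question, which is what the ignition-validate tool named in the log is for.

```python
import json
from typing import Optional, Tuple


def find_json_error(text: str) -> Optional[Tuple[int, int, str]]:
    """Return (line, column, message) of the first JSON syntax error, or None."""
    try:
        json.loads(text)
    except json.JSONDecodeError as exc:
        return exc.lineno, exc.colno, exc.msg
    return None


if __name__ == "__main__":
    # Made-up sample with a stray ']' after a key:value pair.
    broken = '{\n  "ignition": {\n    "version": "3.2.0"\n  ]\n}'
    print(find_json_error(broken))
    print(find_json_error('{"ignition": {"version": "3.2.0"}}'))
```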


Once fixed and booted, fix any key issues:


# ssh core@192.168.0.105
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@    WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED!     @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!
Someone could be eavesdropping on you right now (man-in-the-middle attack)!
It is also possible that a host key has just been changed.
The fingerprint for the ECDSA key sent by the remote host is
SHA256:+PaPjXcO/gOaen9+fHfI1q7s7XQgaczHXUWm6Gtf56E.
Please contact your system administrator.
Add correct host key in /root/.ssh/known_hosts to get rid of this message.
Offending ED25519 key in /var/lib/sss/pubconf/known_hosts:18
ECDSA host key for 192.168.0.105 has changed and you have requested strict checking.
Host key verification failed.

# ssh-keyscan -t ecdsa 192.168.0.105 >> ~/.ssh/known_hosts



And log in using the previously generated SSH key:


# ssh -i ../../.ssh/id_rsa-os01 core@192.168.0.105
Red Hat Enterprise Linux CoreOS 47.83.202102090044-0
  Part of OpenShift 4.7, RHCOS is a Kubernetes native operating system
  managed by the Machine Config Operator (`clusteroperator/machine-config`).

WARNING: Direct SSH access to machines is not recommended; instead,
make configuration changes via `machineconfig` objects:
  https://docs.openshift.com/container-platform/4.7/architecture/architecture-rhcos.html

---
This is the bootstrap node; it will be destroyed when the master is fully up.

The primary services are release-image.service followed by bootkube.service. To watch their status, run e.g.

  journalctl -b -f -u release-image.service -u bootkube.service
[core@bootstrap01 ~]$


TUVM!




On 3/18/2021 12:59 PM, TomK wrote:
Hi Jorge,

I might have to use a web server for the vApp option as you pointed out. It did allow me to paste the config in; however, I'm wondering whether it truncated anything once saved.

Correct, UPI.  I'll try to muddle through the UPI install a few more times.  I can redirect the console output to a file and check it out.  When you say dmesg, I think you mean just the boot output, unless you're implying there is a way to log in?

If this is still an issue, I'll try the IPI method. 

Thanks,
Tom

On 3/18/2021 7:15 AM, Jorge Rúa wrote:
Hi Tom,

Yeah, troubleshooting Ignition issues can be overwhelming. Either the config is set correctly or it doesn't work at all, and the only way to troubleshoot those problems is to inspect dmesg and the boot logs at power-on.
Ignition files are somewhat picky about format. There are also size limitations on vApp properties, and in some cases you're referencing Ignition files hosted on a separate web server. So check network, DNS resolution, DHCP, firewalling, etc., and inspect boot messages over the serial console (there is no network at this stage).

On the other hand, I see you're using the UPI method, right? Have you tried the IPI method instead? I speak from memory, but AFAIK vSphere installation using IPI has been supported since OCP 4.5, and it's a far more pleasant installation from the user's perspective. Of course, UPI gives you full control over the process, but I'd consider IPI too.

Hope that helps,

Regards


On Thu, Mar 18, 2021 at 5:31, TomK (<tomkcpr mdevsys com>) wrote:
Hey Jorge,

Thanks.  Yup, I did that prior to posting, at least as I interpreted the doc.  It's not working, however, hence the email.  In other words, under Advanced -> Configuration Parameters I've set:

guestinfo.afterburn.initrd.network-kargs

to

ip=10.0.0.101::10.0.0.1:255.255.255.0:bootstrap01.osc01.my.dom::none nameserver=192.168.0.10 nameserver=192.168.0.11 nameserver=192.168.0.12

+ setting guestinfo.ignition.config.data, guestinfo.ignition.config.data.encoding and disk.EnableUUID.  No luck.
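
Since truncation of the pasted value is one of the suspects here, a round-trip check on the encoded config can rule it out. This is a sketch under the assumption that guestinfo.ignition.config.data.encoding is set to base64; the helper names are hypothetical and the sample config is a placeholder, not a real node config.

```python
import base64


def encode_ignition(raw: bytes) -> str:
    """Return the single-line base64 string to paste into guestinfo.ignition.config.data."""
    return base64.b64encode(raw).decode("ascii")


def survived_paste(pasted: str, original: bytes) -> bool:
    """Return True if the pasted value still decodes to the original config."""
    try:
        return base64.b64decode(pasted, validate=True) == original
    except ValueError:
        # binascii.Error is a ValueError: bad padding or invalid characters.
        return False


if __name__ == "__main__":
    raw = b'{"ignition":{"version":"3.2.0"}}'  # placeholder config
    encoded = encode_ignition(raw)
    print(len(encoded), survived_paste(encoded, raw))
    print(survived_paste(encoded[:-1], raw))  # simulate a value cut short
```

Copying the value back out of the VM's configuration parameters and running it through survived_paste shows whether the UI kept all of it.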

How do I troubleshoot these machines?  In other words, how do I see whether what I set is wrong and the process is failing?  I'm reading that these can't be logged into other than via an ssh key.  However, logging in via ssh will only work if the host is on the network, which it isn't since the above isn't working.

I know the network parameters work since I have another provisioning method for RHEL machines using vApp options and an image script that just needs a hostname to get the machine on the network, with unique IPs discovered off the VLANs.  I tested all the VLANs using these standard RHEL OS builds and had no issues getting a machine onto any of them.

Thanks,


On 3/17/2021 3:24 AM, Jorge Rúa wrote:
Hello,

The export command is an example of how to define the value that you then pass to govc, which injects the guestinfo.afterburn.initrd.network-kargs property into the VM.

So you can either do that manually by navigating the vCenter UI and adding the property, or, as the example suggests, export a variable locally with the IP configuration you need and pass it to govc.
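
To avoid typos in the colon-separated fields, the value can also be assembled programmatically. This is a minimal sketch that follows the ip=<ip>::<gateway>:<netmask>:<hostname>:<iface>:none template from the documentation; build_ipcfg is a hypothetical helper, and the addresses in the usage example are the ones already posted in this thread.

```python
from typing import Sequence


def build_ipcfg(ip: str, gateway: str, netmask: str, hostname: str,
                nameservers: Sequence[str], iface: str = "") -> str:
    """Assemble the network-kargs value; an empty iface leaves that field blank."""
    parts = [f"ip={ip}::{gateway}:{netmask}:{hostname}:{iface}:none"]
    parts.extend(f"nameserver={ns}" for ns in nameservers)
    return " ".join(parts)


if __name__ == "__main__":
    print(build_ipcfg(
        ip="10.0.0.101",
        gateway="10.0.0.1",
        netmask="255.255.255.0",
        hostname="bootstrap01.osc01.my.dom",
        nameservers=["192.168.0.10", "192.168.0.11", "192.168.0.12"],
    ))
```

The resulting string is what goes into guestinfo.afterburn.initrd.network-kargs, whether it is typed into the vCenter UI or handed to govc.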

Hope this helps,

On Wed, Mar 17, 2021, 06:01 TomK <tomkcpr mdevsys com> wrote:
Hey Everyone,

Following this page to install OpenShift:

https://docs.openshift.com/container-platform/4.7/installing/installing_vsphere/installing-vsphere.html#installing-vsphere

Optional step:

"On the Customize hardware tab, click VM Options → Advanced."

Where do I add the following?

export IPCFG="ip=<ip>::<gateway>:<netmask>:<hostname>:<iface>:none
nameserver=srv1 [nameserver=srv2 [nameserver=srv3 [...]]]"

The page indicates:

"Optional: Override default DHCP networking in vSphere. To enable static
IP networking:"

There isn't anything on the VM options tab that makes sense for those
instructions.  Could someone help identify what I'm missing here?

--
Thx,
TK.

_______________________________________________
users mailing list
users lists openshift redhat com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


